Abstract. We give a unified treatment of the limit, as the size tends
to infinity, of simply generated random trees, including both the
well-known result in the standard case of critical Galton-Watson trees and
similar but less well-known results in the other cases (i.e., when no
equivalent critical Galton-Watson tree exists). There is a well-defined
limit in the form of an infinite random tree in all cases; for critical
Galton-Watson trees this tree is locally finite but for the other cases the
random limit has exactly one node of infinite degree.

The proofs use a well-known connection to a random allocation model 
that we call balls-in-boxes, and we prove corresponding theorems for this 
model. 

This survey paper contains many known results from many different
sources, together with some new results.

1. Introduction

The main purpose of this survey paper is to study the asymptotic shape
of simply generated random trees in complete generality; this includes
conditioned Galton-Watson trees as a special case, but we will also go beyond
that case. Definitions are given in Section 2; here we only recall that simply
generated trees are defined by a weight sequence $(w_k)$, and that the case
when the weight sequence is a probability distribution yields conditioned
Galton-Watson trees.

It is well-known that in the case of a critical conditioned Galton-Watson
tree, i.e., when the defining offspring distribution has expectation 1, the
random tree has a limit (as the size tends to infinity); this limit is an
infinite random tree, the size-biased Galton-Watson tree defined by Kesten
[74], see also Aldous U], Aldous and Pitman [6] and Lyons, Pemantle and
Peres [84]. It is also well-known that this case is less special than it might
seem; there is a notion of equivalent weight sequences defining the same
simply generated random tree, see Section 4, and a large class of weight
sequences have an equivalent probability weight sequence defining a critical
conditioned Galton-Watson tree. Many probabilists, including myself, have
often concentrated on this "standard" case of critical conditioned
Galton-Watson trees and dismissed the remaining cases as uninteresting
exceptional cases. However, some researchers, in particular mathematical
physicists, have studied such cases too. Bialas and Burda [13] studied one
case (Example EiT] below) and found a phase transition as we leave the
standard case; this can be interpreted as a condensation making the tree
bushy with one or a few nodes of very high degree. This interesting
condensation was studied further by Jonsson and Stefansson [67], who showed
that (in the power-law case) there is a limit tree of a different type,
having one node of infinite degree.

We give in the present paper a unified treatment of the limit as the size 
tends to infinity for all simply generated trees, including both the well- 
known result in the standard case of critical Galton-Watson trees and the 
"exceptional" cases (i.e., when no equivalent probability weight sequence 
exists, or when such a sequence exists but not with mean 1). We will see 
that there is a well-defined limit in the form of an infinite random tree for 
any weight sequence. In the non-standard cases, this infinite random limit 
has exactly one node of infinite degree, so its form differs from the standard 
case of a critical Galton-Watson tree where all nodes in the limit tree have 
finite degrees, but nevertheless the trees are similar; see Sections [5] and [7] for 
details. 

Some important notation, used throughout the paper, is introduced in
Section 3, while Sections 4 and 6 contain further preliminaries. The main
limit theorem for simply generated random trees is stated in Section 7,
together with some other, related, limit theorems concerning node degrees and
fringe subtrees. The differences between different types of weight sequences
are discussed further in Section 8.

The proofs of the limit theorems for random trees use a well-known
connection to a random allocation model that we call balls-in-boxes; this model
exhibits a similar behaviour, with condensation in the non-classical cases,
see e.g. Bialas, Burda and Johnston [14]. The model is defined in Section 10
and the relation between the models is described in Section 14. The
balls-in-boxes model is interesting in its own right, and it has been used for
several other applications; we give some examples from probability theory,
combinatorics and statistical physics in Section 11. We therefore also develop
the general theory for balls-in-boxes with arbitrary weight sequences (in the
range where the mean occupancy is bounded). In particular, we give in
Section 10 theorems corresponding to (and in some ways extending) our main
theorems for random trees.

The limit theorems for balls-in-boxes are proved in Sections 12-13, and
then these results are used to prove the limit theorems for random trees in
Sections 15-16.

The remaining sections contain additional results. Section 17 gives
asymptotic results for the partition functions of the models. The very long
Section 18 gives results on the largest degrees in random trees, and the
largest numbers of balls in a box in the balls-in-boxes model; the section
is long because there are several different cases with different types of
behaviour. In particular, we study in Section 18.6 the case when there is
condensation, and investigate whether this appears as condensation to a
single box (or node), or whether the condensation is distributed over several
boxes (nodes); it turns out that both cases can occur. We give also, in
Section 18.7, applications to the size of the largest tree in random forests. In
Section 19 the condensation in random trees is discussed in further detail.
Finally, some additional comments, results and open problems are given in
Sections 20 and 21. Section 20 mentions briefly various other types of
asymptotic results for simply generated random trees, and Section 21 discusses
alternative ways to condition Galton-Watson trees.

This paper contains many known results from many different sources, 
together with some new results. (We believe, for example, that the theorems 
in Section [7] are new in the present generality.) We have tried to give relevant 
references, but the absence of references does not necessarily imply that a 
result is new. 



2. Simply generated trees 

2.1. Ordered rooted trees. The trees that we consider are (with a few
explicit exceptions) rooted and ordered (such trees are also called plane trees).
Recall that a tree is rooted if one node is distinguished as the root $o$; this
implies that we can arrange the nodes in a sequence of generations (or levels),
where generation $x$ consists of all nodes of distance $x$ to the root. (Thus
generation 0 is the root; generation 1 is the set of neighbours of the root,
and so on.) If $v$ is a node with $v \ne o$, then the parent of $v$ is the neighbour
of $v$ on the path from $v$ to $o$; thus, every node except the root has a unique
parent, while the root has no parent. Conversely, for any node $v$, the
neighbours of $v$ that are further away from the root than $v$ are the children of $v$.
The number of children of $v$ is the outdegree $d^+(v) \ge 0$ of $v$. Note that if $v$
is in generation $x$, then its parent is in generation $x - 1$ and its children are
in generation $x + 1$.

Recall further that a rooted tree is ordered if the children of each node are
ordered in a sequence $v_1, \dots, v_d$, where $d = d^+(v) \ge 0$ is the outdegree of $v$.
See e.g. Drmota [33] for more information on these and other types of trees.
(The trees we consider are called planted plane trees in [s^.) We identify
trees that are isomorphic in the obvious (order preserving) way. (Formally,
we can define our trees as equivalence classes. Alternatively, we may select
a specific representative in each equivalence class as in Section 6.)

Remark 2.1. Some authors prefer to add an extra (phantom) node as a
parent of the root; such trees are called planted. (An alternative version is
to add only a pendant edge at the root, with no second endpoint.) There
is an obvious one-to-one correspondence between trees with and without
the extra node, so the difference is just a matter of formulations, but when
comparing results one should be careful whether, for example, the extra
node is counted or not. The extra node yields the technical advantage that
also the root has indegree 1 and thus total degree $1 + d^+(v)$; it further
gives each embedding in the plane a unique ordering of the children of every
node (in clockwise order from the parent, say). Nevertheless, we find this
device less natural and we will not use it in the present paper. (We use
outdegrees instead of degrees and assume that an ordering of the children
as above is given; then there are no problems.)

We are primarily interested in (large) finite trees, but we will also
consider infinite trees, for example as limit objects in our main theorem
(stated in Section 7). The infinite trees may have nodes with infinite
outdegree $d^+(v) = \infty$; in this case we assume that the children are
ordered $v_1, v_2, \dots$ (i.e., the order type of the set of children is $\mathbb{N}$).

We let $\mathfrak{T}_n$ be the set of all ordered rooted trees with $n$ nodes (including
the root) and let $\mathfrak{T}_f := \bigcup_{n=1}^{\infty} \mathfrak{T}_n$ be the set of all finite ordered rooted trees;
see further Section 6.

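Remark 2.2. Note that $\mathfrak{T}_n$ is a finite set. In fact, it is well-known that its
size $|\mathfrak{T}_n|$ is the $(n-1)$:th Catalan number
\[ |\mathfrak{T}_n| = C_{n-1} = \frac{1}{n}\binom{2n-2}{n-1} = \frac{(2n-2)!}{n!\,(n-1)!}, \tag{2.1} \]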



see e.g. [H, Section 1.2.2 and Theorem 3.2], [13, Section 1.2.3] or fw^ . 
Exercise 6.19(e)], but we do not need this. 

For any tree $T$, we let $|T|$ denote the number of nodes; we call $|T|$ the
size of $T$. As is well known, for any finite tree $T$,
\[ \sum_{v \in T} d^+(v) = |T| - 1, \tag{2.2} \]



since every node except the root is the child of exactly one node. 
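The count (2.1) and the identity (2.2) can be checked by brute force for small $n$. The following is a minimal sketch, not part of the paper's argument; it assumes the standard encoding of an ordered rooted tree by its outdegrees listed in depth-first order, and the name `plane_trees` is introduced here for illustration only.

```python
from math import comb


def plane_trees(n):
    """Yield every ordered rooted tree with n nodes, encoded (an assumption
    of this sketch) as the tuple of its outdegrees in depth-first order."""

    def extend(prefix, pending):
        # pending = children promised by earlier nodes but not yet listed
        if len(prefix) == n:
            if pending == 0:
                yield tuple(prefix)
            return
        if pending == 0:
            return  # the tree closed before reaching n nodes
        left = n - len(prefix)  # nodes still to be listed, including the next
        for d in range(left - pending + 1):
            yield from extend(prefix + [d], pending - 1 + d)

    # At the start only the root is waiting to be listed.
    yield from extend([], 1)


if __name__ == "__main__":
    for n in range(1, 7):
        trees = list(plane_trees(n))
        # number of trees, Catalan number C_{n-1}, and a check of (2.2)
        print(n, len(trees), comb(2 * n - 2, n - 1) // n,
              all(sum(t) == n - 1 for t in trees))
```

Each encoded tree has outdegree sum $n - 1$, which is exactly (2.2), and the number of encodings agrees with (2.1) for the sizes tried.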

2.2. Galton-Watson trees. An important class of examples of random
ordered rooted trees is given by the Galton-Watson trees. These are
defined as the family trees of Galton-Watson processes: Given a probability
distribution $(\pi_k)_{k=0}^{\infty}$ on $\mathbb{Z}_{\ge 0}$, or, equivalently, a random variable $\xi$ with
distribution $(\pi_k)_{k=0}^{\infty}$, we build the tree $\mathcal{T}$ recursively, starting with the root
and giving each node a number of children that is an independent copy
of $\xi$. (We call $(\pi_k)_{k=0}^{\infty}$ the offspring distribution of $\mathcal{T}$; we sometimes also
abuse the language and call $\xi$ the offspring distribution.) In other words,
the outdegrees $d^+(v)$ are i.i.d. with the distribution $(\pi_k)_{k=0}^{\infty}$.

Recall that the Galton-Watson process is called subcritical, critical or
supercritical as the expected number of children $\mathbb{E}\xi = \sum_{k=0}^{\infty} k\pi_k$ satisfies
$\mathbb{E}\xi < 1$, $\mathbb{E}\xi = 1$ or $\mathbb{E}\xi > 1$. It is a standard basic fact of branching process
theory that $\mathcal{T}$ is finite a.s. if $\mathbb{E}\xi \le 1$ (i.e., in the subcritical and critical cases),
but $\mathcal{T}$ is infinite with positive probability if $\mathbb{E}\xi > 1$ (the supercritical case),
see e.g. Athreya and Ney @]. 
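The recursive construction can be simulated directly. The sketch below is illustrative only: the function name, the depth-first exploration order and the cut-off `max_nodes` are assumptions made here (the cut-off is needed because a supercritical tree may be infinite).

```python
import random


def sample_gw_outdegrees(pi, rng, max_nodes=10_000):
    """Sample a Galton-Watson tree with offspring law pi (pi[k] = P(xi = k)).

    Returns the outdegrees in depth-first order, or None if the tree
    has not died out after max_nodes nodes.
    """
    values = range(len(pi))
    degrees = []
    pending = 1  # nodes that exist but have not yet been given children
    while pending > 0:
        if len(degrees) >= max_nodes:
            return None
        d = rng.choices(values, weights=pi)[0]  # independent copy of xi
        degrees.append(d)
        pending += d - 1
    return degrees


if __name__ == "__main__":
    rng = random.Random(2024)
    # Critical binary offspring law: 0 or 2 children with probability 1/2.
    sizes = [len(t) for t in (sample_gw_outdegrees([0.5, 0.0, 0.5], rng)
                              for _ in range(10)) if t is not None]
    print(sizes)
```

Since every node receives an independent copy of $\xi$, the order in which nodes are explored does not affect the law of the resulting tree.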




The Galton-Watson trees have random sizes. We are mainly interested
in random trees with a given size; we thus define $\mathcal{T}_n$ as $\mathcal{T}$ conditioned on
$|\mathcal{T}| = n$. These random trees $\mathcal{T}_n$ are called conditioned Galton-Watson trees.
By definition, $\mathcal{T}_n$ has size $|\mathcal{T}_n| = n$.

It is well-known that several important classes of random trees can be
seen as conditioned Galton-Watson trees, see e.g. Aldous 0], Devroye [s^ ],
Drmota [33] and Section [H.



2.3. Simply generated trees. The random trees that we will study are a
generalization of the Galton-Watson trees. We suppose in this paper that we
are given a fixed weight sequence $\mathbf{w} = (w_k)_{k \ge 0}$ of non-negative real numbers.
We then define the weight of a finite tree $T \in \mathfrak{T}_f$ by
\[ w(T) := \prod_{v \in T} w_{d^+(v)}, \tag{2.3} \]
taking the product over all nodes $v$ in $T$. Trees with such weights are called
simply generated trees and were introduced by Meir and Moon [s^. To avoid
trivialities, we assume that $w_0 > 0$ and that there exists some $k \ge 2$ with
$w_k > 0$.

We let $\mathcal{T}_n$ be the random tree obtained by picking an element of $\mathfrak{T}_n$ at
random with probability proportional to its weight, i.e.,
\[ \mathbb{P}(\mathcal{T}_n = T) = \frac{w(T)}{Z_n}, \qquad T \in \mathfrak{T}_n, \tag{2.4} \]
where the normalizing factor $Z_n$ is given by
\[ Z_n = Z_n(\mathbf{w}) := \sum_{T \in \mathfrak{T}_n} w(T). \tag{2.5} \]
$Z_n$ is known as the partition function. This definition makes sense only
when $Z_n > 0$; we tacitly consider only such $n$ when we discuss $\mathcal{T}_n$. Our
assumptions $w_0 > 0$ and $w_k > 0$ for some $k \ge 2$ imply that $Z_n > 0$ for
infinitely many $n$, see Corollary 14.6 for a more precise result. (In most
applications, $w_1 > 0$, and then $Z_n > 0$ for every $n \ge 1$, so there is no
problem at all. The archetypical example with a parity restriction is given
by the random (full) binary tree, see Example 9.3, for which $Z_n > 0$ if and
only if $n$ is odd.)
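For a weight sequence with finitely many non-zero terms, the partition function (2.5) can be computed without listing trees, by splitting a tree into its root and the ordered forest of subtrees below it. The following sketch is an illustration under that assumption; the helper name `partition_function` and the table `F` are introduced here and are not notation from the paper.

```python
from fractions import Fraction


def partition_function(w, N):
    """Return [Z_1, ..., Z_N] for w = (w_0, ..., w_K); weights beyond w_K are 0.

    Uses Z_n = sum_k w_k * F[k][n-1], where F[k][m] is the total weight of
    ordered forests of k trees with m nodes altogether.
    """
    K = len(w) - 1
    Z = [Fraction(0)] * (N + 1)  # Z[0] is unused: no tree has 0 nodes
    F = [[Fraction(0)] * N for _ in range(K + 1)]
    F[0][0] = Fraction(1)  # the empty forest
    for n in range(1, N + 1):
        m = n - 1  # nodes below the root
        for k in range(1, K + 1):
            # the first tree of the forest has j nodes, the rest share m - j
            F[k][m] = sum(Z[j] * F[k - 1][m - j] for j in range(1, m + 1))
        Z[n] = sum(Fraction(w[k]) * F[k][m] for k in range(K + 1))
    return Z[1:]


if __name__ == "__main__":
    # Full binary trees (w_0 = w_2 = 1): Z_n vanishes for even n.
    print([int(z) for z in partition_function([1, 0, 1], 9)])
    # w_k = 1 for k <= 7: Z_n counts all plane trees with n <= 8 nodes.
    print([int(z) for z in partition_function([1] * 8, 8)])
```

The first example exhibits the parity restriction mentioned above for the full binary tree, and the second recovers the Catalan numbers of Remark 2.2.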

One particularly important case is when $\sum_{k=0}^{\infty} w_k = 1$, so the weight
sequence $(w_k)$ is a probability distribution on $\mathbb{Z}_{\ge 0}$. (We then say that $(w_k)$
is a probability weight sequence.) In this case we let $\xi$ be a random variable
with the corresponding distribution: $\mathbb{P}(\xi = k) = w_k$; we further let $\mathcal{T}$ be
the random Galton-Watson tree generated by $\xi$. It follows directly from the
definitions that for every finite tree $T \in \mathfrak{T}_f$, $\mathbb{P}(\mathcal{T} = T) = w(T)$. Hence
\[ Z_n = \mathbb{P}(|\mathcal{T}| = n) \tag{2.6} \]
and the simply generated random tree $\mathcal{T}_n$ is the same as the random
Galton-Watson tree $\mathcal{T}$ conditioned on $|\mathcal{T}| = n$, i.e., it equals the conditioned
Galton-Watson tree $\mathcal{T}_n$ defined above.

It is well-known, see Section 4 for details, that in many cases it is possible
to change the weight sequence $(w_k)$ to a probability weight sequence without
changing the distribution of the random trees $\mathcal{T}_n$; in this case $\mathcal{T}_n$ can thus be
seen as a conditioned Galton-Watson tree. Moreover, in many cases this can
be done such that the resulting probability distribution has mean 1. In such
cases it thus suffices to consider the case of a probability weight sequence
with mean $\mathbb{E}\xi = 1$; then $\mathcal{T}_n$ is a conditioned critical Galton-Watson tree. It
turns out that this is a nice and natural setting, with many known results
proved by many different authors. (In many papers it is further assumed
that $\xi$ has finite variance, or even a finite exponential moment. This is not
needed for the main results presented here, but may be necessary for other
results. See also Sections [HI, 18 and 20.)

3. Notation 

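We consider a fixed weight sequence $\mathbf{w} = (w_k)_{k \ge 0}$. The support $\operatorname{supp}(\mathbf{w})$
of the weight sequence $\mathbf{w} = (w_k)$ is $\{k : w_k > 0\}$. We define
\[ \omega = \omega(\mathbf{w}) := \sup \operatorname{supp}(\mathbf{w}) = \sup\{k : w_k > 0\} \le \infty. \tag{3.1} \]
(When considering $\mathcal{T}_n$, we assume, as said above, $w_0 > 0$ and $w_k > 0$ for
some $k \ge 2$; this can be written $0 \in \operatorname{supp}(\mathbf{w})$ and $\omega \ge 2$.)

We further define (assuming that the support contains at least two points)
\[ \operatorname{span}(\mathbf{w}) := \max\{d \ge 1 : d \mid (i - j) \text{ whenever } w_i, w_j > 0\}. \tag{3.2} \]
Since we assume $w_0 > 0$, i.e., $0 \in \operatorname{supp}(\mathbf{w})$, we can simplify this to
\[ \operatorname{span}(\mathbf{w}) = \max\{d \ge 1 : d \mid i \text{ whenever } w_i > 0\}, \tag{3.3} \]
the greatest common divisor of $\operatorname{supp}(\mathbf{w})$.

We let
\[ \Phi(z) := \sum_{k=0}^{\infty} w_k z^k \tag{3.4} \]
be the generating function of the given weight sequence, and let $\rho \in [0, \infty]$
be its radius of convergence. Thus
\[ \rho = 1 / \limsup_{k \to \infty} w_k^{1/k}. \tag{3.5} \]
$\Phi(\rho)$ is always defined, with $0 < \Phi(\rho) \le \infty$. Note that (assuming $\omega > 0$)
$\Phi(\infty) = \infty$; in particular, if $\rho = \infty$, then $\Phi(\rho) = \infty$. On the other hand,
if $\rho < \infty$, then both $\Phi(\rho) = \infty$ and $\Phi(\rho) < \infty$ are possible. If $\rho > 0$, then
$\Phi(t) \nearrow \Phi(\rho)$ as $t \nearrow \rho$ by monotone convergence.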
We further define, for $t$ such that $\Phi(t) < \infty$,
\[ \Psi(t) := \frac{t\Phi'(t)}{\Phi(t)} = \frac{\sum_{k=0}^{\infty} k w_k t^k}{\sum_{k=0}^{\infty} w_k t^k}. \tag{3.6} \]
$\Psi(t)$ is thus defined and finite at least for $0 \le t < \rho$, and if $\Phi(\rho) < \infty$,
then $\Psi(\rho)$ is still defined by (3.6), with $\Psi(\rho) \le \infty$ (note that the numerator
in (3.6) may diverge in this case, but not for $0 \le t < \rho$). Moreover, if
$\Phi(\rho) = \infty$, we define $\Psi(\rho) := \lim_{t \nearrow \rho} \Psi(t) \le \infty$. (The limit exists by
Lemma 3.1(i) below, but may be infinite.)

Alternatively, (3.6) may be written
\[ \Psi(e^x) = \frac{e^x \Phi'(e^x)}{\Phi(e^x)} = \frac{d}{dx} \log \Phi(e^x). \tag{3.7} \]

The function $\Psi$ will play a central role in the sequel. This is mainly
because of Lemma 4.2 below, which gives a probabilistic interpretation of
$\Psi(t)$. Its basic properties are given by the following lemma, which is proved
in Section 12.

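Lemma 3.1. Let $\mathbf{w} = (w_k)_{k=0}^{\infty}$ be a given weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 1$ (i.e., $\omega(\mathbf{w}) > 0$).

(i) If $0 < \rho \le \infty$, then the function
\[ \Psi(t) := \frac{t\Phi'(t)}{\Phi(t)} = \frac{\sum_{k=0}^{\infty} k w_k t^k}{\sum_{k=0}^{\infty} w_k t^k} \tag{3.8} \]
is finite, continuous and (strictly) increasing on $[0, \rho)$, with $\Psi(0) = 0$.

(ii) If $0 < \rho \le \infty$, then $\Psi(t) \to \Psi(\rho) \le \infty$ as $t \nearrow \rho$.

(iii) For any $\rho$, $\Psi$ is continuous $[0, \rho] \to [0, \infty]$, with $\Psi(\rho) \le \infty$.

(iv) If $\rho < \infty$ and $\Phi(\rho) = \infty$, then $\Psi(\rho) := \lim_{t \to \rho} \Psi(t) = \infty$.

(v) If $\rho = \infty$, then $\Psi(\rho) := \lim_{t \to \rho} \Psi(t) = \omega \le \infty$.

Consequently, if $\rho > 0$, then
\[ \Psi(\rho) = \lim_{t \nearrow \rho} \Psi(t) = \sup_{0 \le t < \rho} \Psi(t) \in (0, \infty]. \tag{3.9} \]

We define
\[ \nu := \Psi(\rho). \tag{3.10} \]

In particular, if $\Phi(\rho) < \infty$, then
\[ \nu = \frac{\rho\Phi'(\rho)}{\Phi(\rho)} \le \infty. \tag{3.11} \]

It follows from Lemma 3.1 that $\nu = 0 \iff \rho = 0$, and that if $\rho > 0$, then
\[ \nu := \Psi(\rho) = \lim_{t \nearrow \rho} \Psi(t) = \sup_{0 \le t < \rho} \Psi(t) \in (0, \infty]. \tag{3.12} \]

It follows from (3.8) that $\nu \le \omega$.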

Note that all these parameters depend on the weight sequence $\mathbf{w} = (w_k)$;
we may occasionally write e.g. $\omega(\mathbf{w})$ and $\nu(\mathbf{w})$, but usually we for simplicity
do not show $\mathbf{w}$ explicitly in the notation.
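As a concrete illustration of these parameters, consider a weight sequence with finitely many non-zero terms, so that $\Phi$ is a polynomial, $\rho = \infty$ and, by Lemma 3.1(v), $\nu = \omega$. The sketch below evaluates $\Phi$ and $\Psi$ numerically; the function names `Phi` and `Psi` and the sample weights are choices made here for illustration.

```python
def Phi(w, t):
    """Generating function of the finite weight sequence w = (w_0, ..., w_K)."""
    return sum(wk * t**k for k, wk in enumerate(w))


def Psi(w, t):
    """Psi(t) = t Phi'(t) / Phi(t), written as a ratio of two sums."""
    return sum(k * wk * t**k for k, wk in enumerate(w)) / Phi(w, t)


if __name__ == "__main__":
    w = [1, 1, 1]  # omega = 2, Phi(t) = 1 + t + t^2, rho = infinity
    for t in (0.0, 0.5, 1.0, 2.0, 10.0, 1e6):
        print(t, Psi(w, t))  # increases from 0 towards omega = 2
```

The printed values start at $\Psi(0) = 0$, increase with $t$, and approach $\omega = 2$ without exceeding it, in line with Lemma 3.1 and the bound $\nu \le \omega$.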

Remark 3.2. Let $Z(z)$ denote the generating function $Z(z) := \sum_{n=1}^{\infty} Z_n z^n$.
Then
\[ Z(z) = z\Phi(Z(z)), \tag{3.13} \]
as shown already by Otter [93]. This equation is the basis of much work
on simply generated trees using algebraic and analytic methods, see e.g.
Drmota [33], but the present paper uses different methods and we will use
(3.13) only in a few minor remarks.
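Equation (3.13) also gives a way to compute the coefficients $Z_n$ as a formal power series: iterating $Z \mapsto z\Phi(Z)$ from $Z = 0$ fixes one more coefficient in each round. The following is a sketch for a finite weight sequence; `tree_series` is a name introduced here for illustration.

```python
def tree_series(w, N):
    """Coefficients [Z_1, ..., Z_N] of the power series solving Z = z * Phi(Z),
    for a finite weight sequence w = (w_0, ..., w_K)."""

    def mul(a, b):
        # product of two power series, truncated after the z^N term
        c = [0] * (N + 1)
        for i, ai in enumerate(a):
            if ai:
                for j, bj in enumerate(b[: N + 1 - i]):
                    c[i + j] += ai * bj
        return c

    Z = [0] * (N + 1)
    for _ in range(N):  # after r rounds the coefficients up to z^r are exact
        power = [1] + [0] * N  # Z(z)^0
        phi = [0] * (N + 1)    # accumulates Phi(Z(z)) = sum_k w_k Z(z)^k
        for wk in w:
            phi = [p + wk * q for p, q in zip(phi, power)]
            power = mul(power, Z)
        Z = [0] + phi[:N]      # multiply by z
    return Z[1:]


if __name__ == "__main__":
    print(tree_series([1, 0, 1], 9))  # full binary trees
    print(tree_series([1] * 8, 8))    # all plane trees with at most 8 nodes
```

For the full binary tree the even coefficients vanish, and for $w_k = 1$ the coefficients are the Catalan numbers of Remark 2.2.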
3.1. More notation. We define $\mathbb{N}_0 = \mathbb{Z}_{\ge 0} := \{0, 1, 2, \dots\}$, $\mathbb{N}_1 = \mathbb{Z}_{>0} :=
\{1, 2, \dots\}$, $\overline{\mathbb{N}}_0 := \mathbb{N}_0 \cup \{\infty\}$ and $\overline{\mathbb{N}}_1 := \mathbb{N}_1 \cup \{\infty\}$.

All unspecified limits are as $n \to \infty$. Thus, $a_n \sim b_n$ means $a_n/b_n \to 1$ as
$n \to \infty$. We use $\overset{p}{\longrightarrow}$ and $\overset{d}{\longrightarrow}$ for convergence in probability and distribution,
respectively, of random variables, and $\overset{d}{=}$ for equality in distribution. We use
$o_p$ and $O_p$ in the standard senses: $o_p(a_n)$ is an unspecified random variable
$X_n$ such that $X_n/a_n \overset{p}{\longrightarrow} 0$ as $n \to \infty$, and $O_p(a_n)$ is a random variable $X_n$
such that $X_n/a_n$ is stochastically bounded (usually called tight). We say
that some event holds w.h.p. (with high probability) if its probability tends
to 1 as $n \to \infty$. (See further e.g. [62].)

A coupling of two random variables $X$ and $Y$ is formally a pair of random
variables $X'$ and $Y'$ defined on a common probability space such that
$X \overset{d}{=} X'$ and $Y \overset{d}{=} Y'$; with a slight abuse of notation we may continue to write $X$
and $Y$, thus replacing the original variables with new ones having the same
distributions.

We write $X_n \overset{d}{\approx} X'_n$ for two sequences of random variables or vectors $X_n$
and $X'_n$ if there exists a coupling of $X_n$ and $X'_n$ with $X_n = X'_n$ w.h.p.; this
is equivalent to $d_{TV}(X_n, X'_n) \to 0$ as $n \to \infty$, where $d_{TV}$ denotes the total
variation distance.

We use $C_1, C_2, \dots$ to denote unimportant constants, possibly different at
different occurrences.

Recall that $d^+(v) = d^+_T(v)$ always denotes the outdegree of a node $v$ in a
tree $T$. (We use the notation $d^+(v)$ rather than $d(v)$ to emphasise this.) We
will not use the total degree $d(v) = 1 + d^+(v)$ (when $v \ne o$), but care should
be taken when comparing with other papers.

4. Equivalent weights 
If $a, b > 0$ and we change $w_k$ to
\[ \tilde{w}_k := a b^k w_k, \tag{4.1} \]
then, for every tree $T \in \mathfrak{T}_n$, $w(T)$ is changed to, using (2.2),
\[ \tilde{w}(T) = a^{|T|} b^{\sum_{v \in T} d^+(v)} w(T) = a^n b^{n-1} w(T). \tag{4.2} \]
Consequently, $Z_n$ is changed to
\[ \tilde{Z}_n := a^n b^{n-1} Z_n, \tag{4.3} \]
and the probabilities in (2.4) are not changed. In other words, the new
weight sequence $(\tilde{w}_k)$ defines the same simply generated random trees $\mathcal{T}_n$ as
$(w_k)$. (This is essentially due to Kennedy [73], who did not consider trees
but showed the corresponding result for Galton-Watson processes. See also
Aldous [3].) We say that weight sequences $(w_k)$ and $(\tilde{w}_k)$ related by (4.1)
(for some $a, b > 0$) are equivalent. (This is clearly an equivalence relation
on the set of weight sequences.)
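The invariance of the tree distribution under (4.1) can be verified exactly for small $n$. The sketch below represents a tree as the tuple of the subtrees of its root (a representation chosen here, not taken from the paper) and uses exact rational arithmetic; the names `forests`, `weight` and `tree_distribution` and the sample weights are illustrative assumptions.

```python
from fractions import Fraction
from math import prod


def forests(m):
    """All ordered forests with m nodes in total; a tree with n nodes is
    identified with the forest of subtrees of its root, i.e. forests(n - 1)."""
    if m == 0:
        yield ()
        return
    for first in range(1, m + 1):
        for head in forests(first - 1):   # first tree, with `first` nodes
            for tail in forests(m - first):
                yield (head,) + tail


def weight(tree, w):
    """w(T) = product over nodes of w_{outdegree}, as in (2.3)."""
    return w[len(tree)] * prod(weight(child, w) for child in tree)


def tree_distribution(w, n):
    """Return ({T: P(T_n = T)}, Z_n) following (2.4)-(2.5)."""
    weights = {t: weight(t, w) for t in forests(n - 1)}
    Z = sum(weights.values())
    return {t: x / Z for t, x in weights.items()}, Z


if __name__ == "__main__":
    w = [Fraction(x) for x in (1, 2, 3, 5, 7)]
    a, b, n = Fraction(3, 7), Fraction(2), 5
    w_tilde = [a * b**k * wk for k, wk in enumerate(w)]
    dist, Z = tree_distribution(w, n)
    dist_tilde, Z_tilde = tree_distribution(w_tilde, n)
    print(dist == dist_tilde, Z_tilde == a**n * b**(n - 1) * Z)
```

Both comparisons come out true: the probabilities agree tree by tree, and the partition function changes by the factor $a^n b^{n-1}$ as in (4.3).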



Let us see how replacing $(w_k)$ by the equivalent weight sequence $(\tilde{w}_k)$
affects the parameters defined above. The support, span and $\omega$ are not
affected at all.

The generating function $\Phi(t)$ is replaced by
\[ \tilde{\Phi}(t) := \sum_{k=0}^{\infty} \tilde{w}_k t^k = \sum_{k=0}^{\infty} a b^k w_k t^k = a\Phi(bt), \tag{4.4} \]
with radius of convergence $\tilde{\rho} = \rho/b$. Further, $\Psi(t)$ is replaced by
\[ \tilde{\Psi}(t) := \frac{t\tilde{\Phi}'(t)}{\tilde{\Phi}(t)} = \frac{t a b \Phi'(bt)}{a\Phi(bt)} = \Psi(bt). \tag{4.5} \]
Hence, if $\rho > 0$, $\nu$ is replaced by, using (3.12),
\[ \tilde{\nu} := \sup_{0 \le t < \tilde{\rho}} \tilde{\Psi}(t) = \sup_{0 \le t < \rho/b} \Psi(bt) = \sup_{0 \le s < \rho} \Psi(s) = \nu; \]
if $\rho = 0$ then $\tilde{\nu} = \nu = 0$ is trivial. In other words, $\nu$ is invariant and
depends only on the equivalence class of the weight sequence.

Lemma 4.1. There exists a probability weight sequence equivalent to $(w_k)$
if and only if $\rho > 0$. In this case, the probability weight sequences
equivalent to $(w_k)$ are given by
\[ p_k = \frac{t^k w_k}{\Phi(t)}, \qquad k \ge 0, \tag{4.6} \]
for any $t > 0$ such that $\Phi(t) < \infty$.

Proof. The equivalent weight sequence $(\tilde{w}_k)$ given by (4.1) is a probability
distribution if and only if
\[ 1 = \sum_{k=0}^{\infty} \tilde{w}_k = a \sum_{k=0}^{\infty} w_k b^k = a\Phi(b), \]
i.e., if and only if $\Phi(b) < \infty$ and $a = \Phi(b)^{-1}$. Thus, there exists a probability
weight sequence equivalent to $(w_k)$ if and only if there exists $b > 0$ with
$\Phi(b) < \infty$, i.e., if and only if $\rho > 0$; in this case we can choose any such $b$
and take $a := \Phi(b)^{-1}$, which yields (4.6) (with $t = b$). □

We easily find the probability generating function and thus moments of
the probability weight sequence in (4.6); we state this in a form including
the trivial case $t = 0$.

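Lemma 4.2. If $t \ge 0$ and $\Phi(t) < \infty$, then
\[ p_k := \frac{t^k w_k}{\Phi(t)}, \qquad k \ge 0, \tag{4.7} \]
defines a probability weight sequence $(p_k)$. This probability distribution has
probability generating function
\[ \Phi_t(z) := \sum_{k=0}^{\infty} p_k z^k = \frac{\Phi(tz)}{\Phi(t)}, \tag{4.8} \]
and a random variable $\xi$ with this distribution has expectation
\[ \mathbb{E}\xi = \Phi_t'(1) = \frac{t\Phi'(t)}{\Phi(t)} = \Psi(t) \tag{4.9} \]
and variance
\[ \operatorname{Var}\xi = t\Psi'(t); \tag{4.10} \]
furthermore, for any $s \ge 0$ and $x \ge 0$,
\[ \mathbb{P}(\xi \ge x) \le e^{-sx}\,\frac{\Phi(e^s t)}{\Phi(t)}. \tag{4.11} \]
If $t < \rho$, then $\mathbb{E}\xi$ and $\operatorname{Var}\xi$ are finite. If $t = \rho$, however, $\mathbb{E}\xi$ and $\operatorname{Var}\xi$ may
be infinite (we define $\operatorname{Var}\xi = \infty$ when $\mathbb{E}\xi = \infty$, but $\operatorname{Var}\xi$ may be infinite
also when $\mathbb{E}\xi$ is finite); (4.9)-(4.10) still hold, with $\Psi'(\rho) \le \infty$ defined as the
limit $\lim_{s \nearrow \rho} \Psi'(s)$. The tail estimate (4.11) is interesting only when $t < \rho$,
when we may choose any $s < \log(\rho/t)$ and obtain the estimate $O(e^{-sx})$.

Proof. Direct summations yield
\[ \sum_{k=0}^{\infty} p_k = \frac{\sum_{k=0}^{\infty} t^k w_k}{\Phi(t)} = 1 \tag{4.12} \]
and, more generally,
\[ \sum_{k=0}^{\infty} p_k z^k = \frac{\sum_{k=0}^{\infty} t^k w_k z^k}{\Phi(t)} = \frac{\Phi(tz)}{\Phi(t)}, \tag{4.13} \]
showing that $(p_k)$ is a probability distribution with the probability
generating function given in (4.8).

The expectation $\mathbb{E}\xi = \Phi_t'(1)$ is evaluated by differentiating (4.8) (for
$z < 1$ and then taking the limit as $z \to 1$ to avoid convergence problems if
$t = \rho$), or directly from (4.7) as
\[ \mathbb{E}\xi = \sum_{k=0}^{\infty} k p_k = \frac{\sum_{k=0}^{\infty} k t^k w_k}{\Phi(t)} = \frac{t\Phi'(t)}{\Phi(t)} = \Psi(t). \]

Similarly, the variance is given by, using (4.8) and (4.9),
\[ \operatorname{Var}\xi = \Phi_t''(1) + \Phi_t'(1) - \bigl(\Phi_t'(1)\bigr)^2 = \frac{t^2\Phi''(t)}{\Phi(t)} + \Psi(t) - \Psi(t)^2 = t\Psi'(t). \]
Alternatively,
\[ t\Psi'(t) = t\frac{d}{dt}\,\frac{\sum_{k=0}^{\infty} k w_k t^k}{\sum_{k=0}^{\infty} w_k t^k} = \sum_{k=0}^{\infty} k^2 p_k - \Bigl(\sum_{k=0}^{\infty} k p_k\Bigr)^2 = \mathbb{E}\xi^2 - (\mathbb{E}\xi)^2 = \operatorname{Var}\xi. \]
(In the case $t = \rho$ and $\operatorname{Var}\xi = \infty$, we use this calculation for $t' < t$ and let
$t' \nearrow t$.)

Finally, by Markov's inequality and (4.8),
\[ \mathbb{P}(\xi \ge x) \le e^{-sx}\,\mathbb{E} e^{s\xi} = e^{-sx}\,\Phi_t(e^s) = e^{-sx}\,\frac{\Phi(e^s t)}{\Phi(t)}. \]
□

In particular, taking $t = 1$, we recover the standard facts that if $(w_k)$ is a
probability distribution, so $\Phi(1) = 1$, then it has expectation $\Phi'(1) = \Psi(1)$
and variance $\Psi'(1)$.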
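The formulas for the mean and variance of the tilted distribution (4.7) can be checked exactly on a small example. The following sketch uses rational arithmetic; the function names and the sample weights $w = (1, 3, 0, 2)$ with $t = 1/2$ are chosen here purely for illustration.

```python
from fractions import Fraction


def tilt(w, t):
    """p_k = t^k w_k / Phi(t) for a finite weight sequence w, as in (4.7)."""
    phi = sum(wk * t**k for k, wk in enumerate(w))
    return [wk * t**k / phi for k, wk in enumerate(w)]


def psi_and_derivative(w, t):
    """Return (Psi(t), Psi'(t)) computed from Phi, Phi' and Phi''."""
    phi = sum(wk * t**k for k, wk in enumerate(w))
    phi1 = sum(k * wk * t**(k - 1) for k, wk in enumerate(w) if k >= 1)
    phi2 = sum(k * (k - 1) * wk * t**(k - 2) for k, wk in enumerate(w) if k >= 2)
    psi = t * phi1 / phi
    psi_prime = phi1 / phi + t * phi2 / phi - t * phi1**2 / phi**2
    return psi, psi_prime


if __name__ == "__main__":
    w, t = [1, 3, 0, 2], Fraction(1, 2)
    p = tilt(w, t)
    mean = sum(k * pk for k, pk in enumerate(p))
    var = sum(k * k * pk for k, pk in enumerate(p)) - mean**2
    psi, psi_prime = psi_and_derivative(w, t)
    print(p, mean == psi, var == t * psi_prime)
```

Here $\Phi(1/2) = 11/4$, so $p = (4/11, 6/11, 0, 1/11)$ with mean $9/11 = \Psi(1/2)$ and variance $84/121 = \tfrac12\Psi'(1/2)$.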

Remark 4.3. We see from Lemma 4.1 that the probability weight sequences
equivalent to $(w_k)$ are given by (4.6), where $t \in (0, \rho]$ when $\Phi(\rho) < \infty$
and $t \in (0, \rho)$ when $\Phi(\rho) = \infty$. By Lemma 3.1, $t \mapsto \mathbb{E}\xi = \Psi(t)$ is an
increasing bijection $(0, \rho] \to (0, \nu]$ and $(0, \rho) \to (0, \nu)$. Hence, any equivalent
probability weight sequence is uniquely determined by its expectation, and
the possible expectations are $(0, \nu]$ (when $\Phi(\rho) < \infty$) or $(0, \nu)$ (when $\Phi(\rho) =
\infty$).
Remark 4.4. Note that we will frequently use (4.6) to define a new
probability weight sequence also if we start with a probability weight sequence
$(w_k)$. Probability distributions related in this way are called conjugated or
tilted. Conjugate distributions were introduced by Cramer [27] as an
important tool in large deviation theory, see e.g. [31]. The reason is essentially
the same as in the present paper: by conjugating the distribution we can
change its mean in a way that enables us to keep control over sums $S_n$.



5. A modified Galton-Watson tree

Let $(\pi_k)_{k \ge 0}$ be a probability distribution on $\mathbb{N}_0$ and let $\xi$ be a random
variable on $\mathbb{N}_0$ with distribution $(\pi_k)_{k=0}^{\infty}$:
\[ \mathbb{P}(\xi = k) = \pi_k, \qquad k = 0, 1, 2, \dots \tag{5.1} \]
We assume that the expectation $\mu := \mathbb{E}\xi = \sum_k k\pi_k \le 1$ (the subcritical or
critical case).

In this case, we define (based on Kesten [74] and Jonsson and Stefansson
[67]) a modified Galton-Watson tree $\hat{\mathcal{T}}$ as follows: There are two types of
nodes: normal and special, with the root being special. Normal nodes have
offspring (outdegree) according to independent copies of $\xi$, while special
nodes have offspring according to independent copies of $\hat{\xi}$, where
\[ \mathbb{P}(\hat{\xi} = k) := \begin{cases} k\pi_k, & k = 0, 1, 2, \dots, \\ 1 - \mu, & k = \infty. \end{cases} \tag{5.2} \]
(Note that this is a probability distribution on $\overline{\mathbb{N}}_1$.) Moreover, all children
of a normal node are normal; when a special node gets an infinite number of
children, all are normal; when a special node gets a finite number of children,
one of its children is selected uniformly at random and is special, while all
other children are normal.

Thus, for a special node, and any integers $j, k$ with $1 \le j \le k < \infty$, the
probability that the node has exactly $k$ children and that the $j$:th of them
is special is $k\pi_k / k = \pi_k$.

Since each special node has at most one special child, the special nodes
form a path from the root; we call this path the spine of $\hat{\mathcal{T}}$. We distinguish
two different cases:

(T1) If $\mu = 1$ (the critical case), then $\hat{\xi} < \infty$ a.s. so each special node has
a special child and the spine is an infinite path. Each outdegree $d^+(v)$
in $\hat{\mathcal{T}}$ is finite, so the tree is infinite but locally finite.

In this case, the distribution of $\hat{\xi}$ in (5.2) is the size-biased
distribution of $\xi$, and $\hat{\mathcal{T}}$ is the size-biased Galton-Watson tree defined
by Kesten [74], see also Aldous, Aldous and Pitman [6], Lyons,
Pemantle and Peres [84] and Remark 5.7 below. The underlying
size-biased Galton-Watson process is the same as the Q-process studied
in Athreya and Ney [1, Section 1.14], which is an instance of Doob's
$h$-transform. (See Lyons, Pemantle and Peres [84] for further related
constructions in other contexts and Geiger and Kauffmann [45] for a
generalization.)

An alternative construction of the random tree $\hat{\mathcal{T}}$ is to start with
the spine (an infinite path from the root) and then at each node in the
spine attach further branches; the number of branches at each node in
the spine is a copy of $\hat{\xi} - 1$ and each branch is a copy of the
Galton-Watson tree $\mathcal{T}$ with offspring distributed as $\xi$; furthermore, at a node
where $k$ new branches are attached, the number of them attached to
the left of the spine is uniformly distributed on $\{0, \dots, k\}$. (All random
choices are independent.) Since the critical Galton-Watson tree $\mathcal{T}$ is
a.s. finite, it follows that $\hat{\mathcal{T}}$ a.s. has exactly one infinite path from the
root, viz. the spine.

(T2) If $\mu < 1$ (the subcritical case), then a special node has with probability
$1 - \mu$ no special child. Hence, the spine is a.s. finite and the number $L$
of nodes in the spine has a (shifted) geometric distribution $\mathrm{Ge}(1 - \mu)$,
\[ \mathbb{P}(L = \ell) = (1 - \mu)\mu^{\ell - 1}, \qquad \ell = 1, 2, \dots. \tag{5.3} \]
The tree $\hat{\mathcal{T}}$ has a.s. exactly one node with infinite outdegree, viz. the
top of the spine. $\hat{\mathcal{T}}$ has a.s. no infinite path.



14 



SVANTE JANSON 



In this case, an alternative construction of $\hat{\mathcal{T}}$ is to start with a spine
of random length $L$, where $L$ has the geometric distribution (5.3). We
attach as in (T1) further branches that are independent copies of the
Galton-Watson tree $\mathcal{T}$; at the top of the spine we attach an infinite
number of branches and at all other nodes in the spine the number
we attach is a copy of $\xi^* - 1$ where $\xi^* \overset{d}{=} (\hat{\xi} \mid \hat{\xi} < \infty)$ has the
size-biased distribution $\mathbb{P}(\xi^* = k) = k\pi_k/\mu$. The spine thus ends with an
explosion producing an infinite number of branches, and this is the
only node with an infinite degree. This is the construction by Jonsson
and Stefansson [67].
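The offspring law (5.2) of a special node and the resulting spine can be written out concretely. The sketch below is illustrative: it computes the law of $\hat{\xi}$ exactly, evaluates (5.3), and samples the outdegrees along the spine in the subcritical case (T2); all function names and the sample offspring law are assumptions made here.

```python
import random
from fractions import Fraction

INF = float("inf")


def special_offspring_law(pi):
    """Law of xi-hat in (5.2): P(k) = k * pi_k, and P(infinity) = 1 - mu."""
    mu = sum(k * p for k, p in enumerate(pi))
    assert mu <= 1, "the construction requires mean at most 1"
    law = {k: k * p for k, p in enumerate(pi) if k * p > 0}
    if mu < 1:
        law[INF] = 1 - mu
    return law


def spine_length_pmf(pi, ell):
    """P(L = ell) = (1 - mu) * mu^(ell - 1), as in (5.3)."""
    mu = sum(k * p for k, p in enumerate(pi))
    return (1 - mu) * mu ** (ell - 1)


def sample_spine(pi, rng):
    """Outdegrees of the special nodes from the root upwards (case (T2) only)."""
    law = special_offspring_law(pi)
    assert INF in law, "critical case: the spine is infinite"
    values = list(law)
    weights = [float(p) for p in law.values()]
    spine = []
    while True:
        d = rng.choices(values, weights=weights)[0]
        spine.append(d)
        if d == INF:  # explosion: no special child, the spine stops here
            return spine


if __name__ == "__main__":
    pi = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # mu = 3/4
    print(special_offspring_law(pi))
    print([spine_length_pmf(pi, ell) for ell in (1, 2, 3)])
    print(sample_spine(pi, random.Random(0)))
```

In this example $\mu = 3/4$, so a special node explodes with probability $1/4$, which is also $\mathbb{P}(L = 1)$.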



Example 5.1. In the extreme case $\mu = 0$, or equivalently $\xi = 0$ a.s., i.e.,
$\pi_0 = 1$ and $\pi_k = 0$ for $k \ge 1$, (5.2) shows that $\hat{\xi} = \infty$ a.s. Hence, every
normal node has no child and is thus a leaf, while every special node has an
infinite number of children, all normal. Consequently, the root is the only
special node, the spine consists of the root only (i.e., its length $L = 1$), and
the tree $\hat{\mathcal{T}}$ consists of the root with an infinite number of leaves attached to
it, i.e., $\hat{\mathcal{T}}$ is an infinite star. (This is also given directly by the alternative
construction in (T2) above.) In contrast, $\mathcal{T}$ consists of the root only, so
$|\mathcal{T}| = 1$. In this case there is no randomness in $\mathcal{T}$ or $\hat{\mathcal{T}}$.

Remark 5.2. In case (T1), if we remove the spine, we obtain a random
forest that can be regarded as coming from a Galton-Watson process with
immigration, where the immigration is described by an i.i.d. sequence of
random variables with the distribution of $\hat{\xi} - 1$, see Lyons, Pemantle and
Peres [84]. (In the Poisson case, Grimmett [4?!] gave a slightly different
description of $\hat{\mathcal{T}}$ using a Galton-Watson process with immigration.)

In case (T2), we can do the same, but now the immigration is different:
at a random (geometric) time, there is an infinite immigration, and after
that there is no more immigration at all.

Remark 5.3. Some related modifications of Galton-Watson trees having a
finite spine have been considered previously. Sagitov and Serra [102]
construct (as a limit for a certain two-type branching process) a random tree
similar to the one in (T2) above (with a subcritical $\xi$), with a finite spine
having a length with the geometric distribution (5.3); the difference is that
at the top of the spine, only a finite number of Galton-Watson trees $\mathcal{T}$ are
attached. (This number may be a copy of $\xi^* - 1$ as at the other points of
the spine, or it may have a different distribution, see [102].) Thus there is
no explosion, and the tree is finite. Another modified Galton-Watson tree is
used by Addario-Berry, Devroye and Janson [1]; the proofs use a truncated
version of (T1) above (with a critical $\xi$), where the spine has a fixed length
$k$; at the top of the spine the special node becomes normal and reproduces
normally with $\xi$ children. Geiger [44] studied $\mathcal{T}$ conditioned on its height
being at least $n$, see Section 21, and gave a construction of it using a spine
of length $n$, but with more complicated rules for the branches. See also the
modified trees $\hat{\mathcal{T}}_{1n}$, $\hat{\mathcal{T}}_{2n}$, $\hat{\mathcal{T}}_{3n}$ in Section 19.

The invariant random sin-tree constructed by Aldous [2] in a more general
situation, is for a critical Galton-Watson process another related tree; it has
an infinite spine as $\hat{\mathcal{T}}$, but differs from $\hat{\mathcal{T}}$ in that the root has $\xi + 1$ children
(and thus $\xi$ normal children) instead of $\hat{\xi}$. In this case, it may be better
to reverse the orientation of the spine and consider the spine as an infinite
path $\cdots v_{-2} v_{-1} v_0$ starting at $-\infty$ (there is thus no root); we attach further
branches (copies of $\mathcal{T}$) as above, with all $v_i$, $i < 0$, special (the number of
children is a copy of $\hat{\xi}$) but the top node $v_0$ normal (the number of children
is a copy of $\xi$, and all are normal).



Kurtz, Lyons, Pemantle and Peres [TSj] and Chassaing and Durhuus [2 
have constructed related trees with infinite spines using multi-type
Galton-Watson processes.

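Remark 5.4. If $\xi$ has the probability generating function $\varphi(x) := \mathbb{E} x^{\xi} =
\sum_{k=0}^{\infty} \pi_k x^k$, then $\hat{\xi}$ has by (5.2) the probability generating function
\[ \mathbb{E} x^{\hat{\xi}} = \sum_{k=0}^{\infty} k\pi_k x^k = x\varphi'(x), \tag{5.4} \]
at least for $0 \le x < 1$. (Also for $\mu < 1$ when $\hat{\xi}$ may take the value $\infty$.)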

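Remark 5.5. In case (T1), the random variable $\hat\xi$ is a.s. finite and has mean

\[ \mathbb E \hat\xi = \sum_{k=0}^{\infty} k\, \mathbb P(\hat\xi = k) = \sum_{k=0}^{\infty} k^2 \pi_k = \mathbb E \xi^2 = \sigma^2 + 1, \tag{5.5} \]

where $\sigma^2 := \operatorname{Var} \xi \le \infty$. In case (T2), we have $\mathbb P(\hat\xi = \infty) > 0$ and thus
$\mathbb E \hat\xi = \infty$. This suggests that in results that are known in the critical case
(T1), and where $\sigma^2$ appears as a parameter (see e.g. Section 20), the correct
generalization of $\sigma^2$ to the subcritical case (T2) is not $\operatorname{Var} \xi$ but $\mathbb E \hat\xi - 1 = \infty$.
(See Remark 5.6 below for a simple example.) We thus define, for any
distribution $(\pi_k)_{k=0}^{\infty}$ with expectation $\mu \le 1$,

\[ \hat\sigma^2 := \mathbb E \hat\xi - 1 = \begin{cases} \sigma^2, & \mu = 1, \\ \infty, & \mu < 1. \end{cases} \tag{5.6} \]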
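
The identities in Remarks 5.4 and 5.5 can be checked on small examples. The following is a minimal sketch, assuming a finitely supported offspring distribution and recalling from (5.2) that $\hat\xi$ takes the value $k$ with probability $k\pi_k$ and the value $\infty$ with the remaining probability $1 - \mu$; the function names and the two example distributions are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def size_biased(pi):
    """Return ([k * pi_k for each finite k], P(xi_hat = infinity) = 1 - mu)."""
    finite_part = [k * p for k, p in enumerate(pi)]
    return finite_part, 1 - sum(finite_part)

def mean_and_variance(pi):
    """Mean and variance of a finitely supported distribution pi on {0, 1, 2, ...}."""
    mu = sum(k * p for k, p in enumerate(pi))
    second_moment = sum(k * k * p for k, p in enumerate(pi))
    return mu, second_moment - mu * mu

# Hypothetical critical example: pi = (1/4, 1/2, 1/4), which has mean 1.
critical = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
hat_probs, p_infinity = size_biased(critical)
mu, sigma2 = mean_and_variance(critical)
# E xi_hat is a finite sum here because p_infinity is zero in the critical case.
mean_hat = sum(k * q for k, q in enumerate(hat_probs))

# Hypothetical subcritical example: pi = (1/2, 1/2), which has mean 1/2.
_, p_infinity_sub = size_biased([Fraction(1, 2), Fraction(1, 2)])
```

In the critical example `mean_hat` equals `sigma2 + 1`, in agreement with (5.5), while in the subcritical example `p_infinity_sub` is positive, which is the situation where $\mathbb E \hat\xi = \infty$.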

Remark 5.6. Let $l_k(T)$ denote the number of nodes with distance $k$ to
the root in a rooted tree $T$. (This is thus the size of the $k$:th generation.)
Trivially, $l_0(T) = 1$, while $l_1(T) = d_T^+(o)$, the root degree.

It follows by the construction of $\hat{\mathcal T}$ and induction that in case (T1), using (5.5),

\[ \mathbb E\, l_k(\hat{\mathcal T}) = 1 + k(\mathbb E \hat\xi - 1) = k\sigma^2 + 1, \qquad k \ge 0. \tag{5.7} \]

In case (T2), we have if $\mu > 0$ and $k \ge 1$ a positive probability that $L = k$
and then $l_k(\hat{\mathcal T}) = \infty$. Thus $\mathbb E\, l_k(\hat{\mathcal T}) = \infty$. Consequently, using (5.6), if
$0 < \mu \le 1$, then

\[ \mathbb E\, l_k(\hat{\mathcal T}) = k\hat\sigma^2 + 1, \qquad k \ge 1. \tag{5.8} \]

However, this fails if $\mu = 0$; in that case, $l_1(\hat{\mathcal T}) = \infty$ but $l_k(\hat{\mathcal T}) = 0$ for $k \ge 2$,
see Example 5.1.

Remark 5.7. As said above, in the case $\mu = 1$, the tree $\hat{\mathcal T}$ is the size-biased
Galton-Watson tree, see [t^], ^ and fsl]. For comparison, we give
the definition of the latter, for an arbitrary distribution $(\pi_k)_{k \ge 0}$ with finite
mean $\mu > 0$: Let, as above, $\xi$ have the distribution $(\pi_k)$, see (5.1), and let
$\xi^*$ have the size-biased distribution defined by

\[ \mathbb P(\xi^* = k) = \frac{k \pi_k}{\mu}, \qquad k = 0, 1, 2, \dots \tag{5.9} \]

(Note that this is a probability distribution on $\mathbb N_1$.) Construct $\mathcal T^*$ as $\hat{\mathcal T}$
above, with normal and special nodes, with the only difference that the
number of children of a special node has the distribution of $\xi^*$ in (5.9).

In the critical case $\mu = 1$, we have $\xi^* \overset{d}{=} \hat\xi$ and thus $\mathcal T^* \overset{d}{=} \hat{\mathcal T}$, but in
the subcritical case $\mu < 1$, $\mathcal T^*$ and $\hat{\mathcal T}$ are clearly different. (Note that $\mathcal T^*$
always is locally finite, but $\hat{\mathcal T}$ is not when $\mu < 1$.) When $\mu > 1$, $\hat{\mathcal T}$ is not
even defined, but $\mathcal T^*$ is. (As remarked by Aldous and Pitman [6], in the
supercritical case $\mathcal T^*$ has a.s. an uncountable number of infinite paths from
the root, in contrast to the case $\mu \le 1$ when the spine a.s. is the only one.)

$\mathcal T^*$ can also be constructed by the alternative construction in (T1) above
starting with an infinite spine, again with the difference that $\hat\xi - 1$ is replaced
by $\xi^* - 1$. $\mathcal T^*$ can also be seen as a Galton-Watson process with immigration
in the same way as in Remark 5.2.

By (5.9), the probability that a given special node in $\mathcal T^*$ has $k \ge 1$
children, with a given one of them special, is

\[ \frac{1}{k}\, \mathbb P(\xi^* = k) = \frac{k \pi_k}{k \mu} = \frac{\pi_k}{\mu}. \tag{5.10} \]

Let $T$ be a fixed tree of height $\ell$, and let $u$ be a node in the $\ell$:th (and
last) generation in $T$. Let $\mathcal T^{*(\ell)}$ denote $\mathcal T^*$ truncated at height $\ell$. It follows
from (5.10) and independence that the probability that $\mathcal T^{*(\ell)} = T$ and that
$u$ is special (i.e., $u$ is the unique element of the spine at distance $\ell$ from the
root) equals $\mu^{-\ell}\, \mathbb P(\mathcal T^{(\ell)} = T)$. Hence, summing over the $l_\ell(T)$ possible $u$,

\[ \mathbb P(\mathcal T^{*(\ell)} = T) = \mu^{-\ell}\, l_\ell(T)\, \mathbb P(\mathcal T^{(\ell)} = T), \tag{5.11} \]

which explains the name size-biased Galton-Watson tree. (As an alternative,
one can thus define $\mathcal T^*$ directly by (5.11), noting that this gives consistent
distributions for $\ell = 1, 2, \dots$, see Kesten [TJ].) See further Section 21.2.

6. The Ulam-Harris tree and convergence

It is convenient, especially when discussing convergence, to regard our 
trees as subtrees of the infinite Ulam-Harris tree defined as follows. (See
e.g. Otter [oi] . iHarrisI (ill. §VI.2], Neveu jlH and Kesten Q.) 



Definition 6.1. The Ulam-Harris tree $U_\infty$ is the infinite rooted tree with
node set $V_\infty := \bigcup_{k=0}^{\infty} \mathbb N_1^k$, the set of all finite strings $i_1 \cdots i_k$ of positive
integers, including the empty string $\emptyset$ which we take as the root $o$, and with
an edge joining $i_1 \cdots i_k$ and $i_1 \cdots i_{k+1}$ for any $k \ge 0$ and $i_1, \dots, i_{k+1} \in \mathbb N_1$.

Thus every node $v = i_1 \cdots i_k$ has outdegree $d^+(v) = \infty$; the children of
$v$ are the strings $v1, v2, v3, \dots$, and we let them have this order so $U_\infty$
becomes an infinite ordered rooted tree. The parent of $i_1 \cdots i_k$ ($k > 0$) is
$i_1 \cdots i_{k-1}$.

The family $\mathfrak T$ of ordered rooted trees can be identified with the set of all
rooted subtrees $T$ of $U_\infty$ that have the property

\[ i_1 \cdots i_k i \in V(T) \implies i_1 \cdots i_k j \in V(T) \text{ for all } j \le i. \tag{6.1} \]

Equivalently, by identifying $T$ and its node set $V(T)$, we can regard $\mathfrak T$ as
the family of all subsets $V$ of $V_\infty$ that satisfy

\[ \emptyset \in V, \tag{6.2} \]
\[ i_1 \cdots i_{k+1} \in V \implies i_1 \cdots i_k \in V, \tag{6.3} \]
\[ i_1 \cdots i_k i \in V \implies i_1 \cdots i_k j \in V \text{ for all } j \le i. \tag{6.4} \]

We let $\mathfrak T_f := \{T \in \mathfrak T : |T| < \infty\}$ be the set of all finite ordered rooted
trees and $\mathfrak T_n := \{T \in \mathfrak T : |T| = n\}$ the set of all ordered rooted trees of size
$n$.
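
For finite trees the conditions (6.2)-(6.4) are easy to test mechanically. The sketch below is only an illustration: it assumes that nodes of $U_\infty$ are encoded as Python tuples of positive integers (with the empty tuple as the root), and the function name is a hypothetical choice.

```python
def is_tree_node_set(nodes):
    """Check conditions (6.2)-(6.4) for a finite set of tuples of positive integers."""
    nodes = set(nodes)
    if () not in nodes:                      # (6.2): the root (empty string) is in V
        return False
    for v in nodes:
        if not v:
            continue
        parent, last = v[:-1], v[-1]
        if parent not in nodes:              # (6.3): the parent of v is in V
            return False
        # (6.4): all siblings to the left of v are in V; checking the nearest
        # left sibling for every node is enough, by induction on the last index.
        if last > 1 and parent + (last - 1,) not in nodes:
            return False
    return True
```

For example, `{(), (1,), (2,), (1, 1)}` passes the check, while `{(), (2,)}` fails (6.4) because the first child of the root is missing.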

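If $T \in \mathfrak T$, we let as above $d^+(v) = d_T^+(v)$ denote the outdegree of $v$ for
every $v \in V(T)$. For convenience, we also define $d^+(v) = 0$ for $v \notin V(T)$;
thus $d^+(v)$ is defined for every $v \in V_\infty$, and the tree $T \in \mathfrak T$ is uniquely
determined by the (out)degree sequence $(d^+(v))_{v \in V_\infty}$. It is easily seen that
this gives a bijection between $\mathfrak T$ and the set of sequences $(d_v) \in \overline{\mathbb N}_0^{V_\infty}$ with
the property

\[ d_{i_1 \cdots i_k i} = 0 \quad \text{when } i > d_{i_1 \cdots i_k}. \tag{6.5} \]

The family $\mathfrak T_{lf}$ of locally finite trees corresponds to the subset of all such
sequences with all $d_v < \infty$, and the family $\mathfrak T_f$ of finite trees corresponds to
the subset of all such sequences $(d_v)$ with all $d_v < \infty$ and only finitely many
$d_v \ne 0$.

In this way we have $\mathfrak T_f \subset \mathfrak T_{lf} \subset \mathfrak T \subset \overline{\mathbb N}_0^{V_\infty}$; note that $\mathfrak T_{lf} = \mathfrak T \cap \mathbb N_0^{V_\infty}$, so
$\mathfrak T_f \subset \mathfrak T_{lf} \subset \mathbb N_0^{V_\infty}$.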

We give $\overline{\mathbb N}_0$ the usual compact topology as the one-point compactification
of the discrete space $\mathbb N_0$. Thus $\overline{\mathbb N}_0$ is a compact metric space. (One metric,
among many equivalent ones, is given by the homeomorphism $n \mapsto 1/(n+1)$
onto $\{1/n\}_{n=1}^{\infty} \cup \{0\} \subset \mathbb R$.) We give $\overline{\mathbb N}_0^{V_\infty}$ the product topology and its
subspaces $\mathfrak T_f$, $\mathfrak T_{lf}$ and $\mathfrak T$ the induced topologies. Thus $\overline{\mathbb N}_0^{V_\infty}$ is a compact
metric space, and its subspaces $\mathfrak T_f$, $\mathfrak T_{lf}$ and $\mathfrak T$ are metric spaces. (The precise
choice of metric on these spaces is irrelevant; we will not use any explicit
metric except briefly in Section 19.) Moreover, the condition (6.5) defines
$\mathfrak T$ as a closed subset of $\overline{\mathbb N}_0^{V_\infty}$; thus $\mathfrak T$ is a compact metric space. ($\mathfrak T_f$ and $\mathfrak T_{lf}$
are not compact. In fact, it is easily seen that they are dense proper subsets
of $\mathfrak T$. $\mathfrak T_f$ is a countable discrete space.)

In other words, if $T_n$ and $T$ are trees in $\mathfrak T$, then $T_n \to T$ if and only if the
outdegrees converge pointwise:

\[ d_{T_n}^+(v) \to d_T^+(v) \quad \text{for each } v \in V_\infty. \tag{6.6} \]

It is easily seen that it suffices to consider $v \in V(T)$, i.e., (6.6) is equivalent
to

\[ d_{T_n}^+(v) \to d_T^+(v) \quad \text{for each } v \in V(T), \tag{6.7} \]

since (6.7) implies that if $v \notin V(T)$, then $v \notin V(T_n)$ for sufficiently large $n$,
and thus $d_{T_n}^+(v) = 0$. (Consider the last node $w$ in $V(T)$ on the path from
the root to $v$ and use $d_{T_n}^+(w) \to d_T^+(w)$.)

Alternatively, we may as above consider the node set $V(T)$ as a subset of
$V_\infty$ and regard $\mathfrak T$ as the family of all subsets of $V_\infty$ that satisfy (6.2)-(6.4).
We identify the family of all subsets of $V_\infty$ with $\{0,1\}^{V_\infty}$, and give this
family the product topology, making it into a compact metric space. (Thus,
convergence means convergence of the indicator $\mathbf 1\{v \in \cdot\}$ for each $v \in V_\infty$.)
This induces a topology on $\mathfrak T$, where $T_n \to T$ means that, for each $v \in V_\infty$,
if $v \in V(T)$, then $v \in V(T_n)$ for all large $n$, and, conversely, if $v \notin V(T)$,
then $v \notin V(T_n)$ for all large $n$.

If $v = i_1 \cdots i_k$ with $k > 0$, then $v \in V(T)$ if and only if $i_k \le d_T^+(i_1 \cdots i_{k-1})$.
It follows immediately that $V(T_n) \to V(T)$ in the sense just described, if and
only if (6.6) holds. The two definitions of $T_n \to T$ above are thus equivalent
(for $\mathfrak T$, and thus also for its subsets $\mathfrak T_f$ and $\mathfrak T_{lf}$).

Furthermore, we see, e.g. from (6.6), that the convergence of trees can be
described recursively: Let $T_{(j)}$ denote the $j$:th subtree of $T$, i.e., the subtree
rooted at the $j$:th child of the root, for $j = 1, \dots, d_T^+(o)$. (We consider only finite
$j$, even when $d_T^+(o) = \infty$.) Then, $T_n \to T$ if and only if

(i) the root degrees converge: $d_{T_n}^+(o) \to d_T^+(o)$, and further,
(ii) for each $j = 1, \dots, d_T^+(o)$, $T_{n,(j)} \to T_{(j)}$.

(Note that $T_{n,(j)}$ is defined for large $n$, at least, by (i).)

It is important to realize that the notion of convergence used here is a local
(pointwise) one, so we consider only a single $v$ at a time, or, equivalently, a
finite set of $v$; there is no uniformity in $v$ required.

If $T$ is a locally finite tree, $T \in \mathfrak T_{lf}$, then $d_T^+(v) < \infty$ for each $v$ and thus
(6.6) means that for each $v$, $d_{T_n}^+(v) = d_T^+(v)$ for all sufficiently large $n$.

Let $T^{(m)}$ denote the tree $T$ truncated at height $m$, i.e., the subtree of $T$
consisting of all nodes in generations $0, \dots, m$. If $T$ is locally finite, then
each $T^{(m)}$ is a finite tree, and it is easily seen from (6.7) that convergence
to $T$ can be characterised as follows:

Lemma 6.2. If $T$ is locally finite, then, for any trees $T_n \in \mathfrak T$,

\[ T_n \to T \iff T_n^{(m)} \to T^{(m)} \text{ for each } m \iff T_n^{(m)} = T^{(m)} \text{ for each } m \text{ and large } n. \]

(The last condition means for $n$ larger than some $n(m)$ depending on $m$.) □

This notion of convergence for locally finite trees is widely used; see e.g.
Otter [933 and Aldous and Pitman 

In general, if $T$ is not locally finite, this characterization fails. (For example,
if $S_n$, $1 \le n \le \infty$, is a star where the root has outdegree $n$ and
its children all have outdegree 0, then $S_n \to S_\infty$, but $S_n^{(m)} \ne S_\infty^{(m)}$ for
all $n$ and $m \ge 1$.) Instead, we have to localise also horizontally: Let
$V^{[m]} := \bigcup_{k=0}^{m} \{1, \dots, m\}^k$, the subset of $V_\infty$ consisting of strings of length
at most $m$ with all elements at most $m$. For a tree $T \in \mathfrak T$, let $T^{[m]}$ be the
subtree with node set $V(T) \cap V^{[m]}$, i.e., the tree $T$ truncated at height $m$
and pruned so that all outdegrees are at most $m$. It is then easy to see from
(6.6) that the following analogue and generalization of Lemma 6.2 holds:

Lemma 6.3. For any trees $T, T_n \in \mathfrak T$,

\[ T_n \to T \iff T_n^{[m]} \to T^{[m]} \text{ for each } m \iff T_n^{[m]} = T^{[m]} \text{ for each } m \text{ and all large } n. \]

(The last condition means for $n$ larger than some $n(m)$ depending on $m$.) □
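
The star example can be made concrete. The following is a small sketch, assuming the same encoding of nodes as tuples of positive integers; `star`, `truncate` and `star_infinity_truncated` are hypothetical helper names. It illustrates that the truncations $S_n^{[m]}$ agree with $S_\infty^{[m]}$ as soon as $n \ge m$, which by Lemma 6.3 gives $S_n \to S_\infty$.

```python
def star(n):
    """Node set of the star S_n: a root with n children, all of them leaves."""
    return {()} | {(i,) for i in range(1, n + 1)}

def truncate(nodes, m):
    """Node set of T^[m]: keep strings of length at most m with all entries at most m."""
    return {v for v in nodes if len(v) <= m and all(i <= m for i in v)}

def star_infinity_truncated(m):
    """S_infinity^[m] written out directly: the root and its first m children (m >= 1)."""
    return star(m)

m = 3
agree_from_m = all(truncate(star(n), m) == star_infinity_truncated(m) for n in range(m, 20))
differ_below_m = truncate(star(m - 1), m) != star_infinity_truncated(m)
```

Truncating only in height does not help here, since every star already has height 1; it is the pruning of the outdegrees that makes the truncations stabilise.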

Our notion of convergence for general trees $T \in \mathfrak T$ was introduced in this
form by Jonsson and Stefansson [67] (where the truncation $T^{[m]}$ is called a
left ball).

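Remark 6.4. It is straightforward to obtain versions of Lemmas 6.2-6.3
for random trees $T$, $T_n$ and convergence in probability or distribution. For
example: For any random trees $T, T_n \in \mathfrak T$,

\[ T_n \overset{d}{\to} T \iff T_n^{[m]} \overset{d}{\to} T^{[m]} \text{ for each } m. \tag{6.8} \]

If $T \in \mathfrak T_{lf}$, a.s., then we also have

\[ T_n \overset{d}{\to} T \iff T_n^{(m)} \overset{d}{\to} T^{(m)} \text{ for each } m, \tag{6.9} \]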

see e.g. Aldous and Pitman [5']. The proofs are standard using the methods 
in e.g. Billingsley [15]. 

7. Main result for simply generated random trees 

Our main result for trees is the following, proved in Section 15. The case
when $\nu \ge 1$ was shown implicitly by Kennedy [73] (who considered Galton-Watson
processes and not trees), and explicitly by Aldous and Pitman [6],
see also Grimmett [47], Kolchin [76], Kesten |7^] and Aldous [4]. Special
cases with $0 < \nu < 1$ and $\nu = 0$ are given by Jonsson and Stefansson [67]
and Janson, Jonsson and Stefansson [6^, respectively.

Theorem 7.1. Let $\mathbf w = (w_k)_{k \ge 0}$ be any weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 2$.

(i) If $\nu \ge 1$, let $\tau$ be the unique number in $[0, \rho]$ such that $\Psi(\tau) = 1$.

(ii) If $\nu < 1$, let $\tau := \rho$.

In both cases, $0 \le \tau < \infty$ and $0 < \Phi(\tau) < \infty$. Let

\[ \pi_k := \frac{\tau^k w_k}{\Phi(\tau)}, \qquad k \ge 0; \tag{7.1} \]

then $(\pi_k)_{k \ge 0}$ is a probability distribution, with expectation

\[ \mu = \Psi(\tau) = \min(\nu, 1) \le 1 \tag{7.2} \]

and variance $\sigma^2 = \tau \Psi'(\tau) \le \infty$. Let $\hat{\mathcal T}$ be the infinite modified Galton-Watson
tree constructed in Section 5 for the distribution $(\pi_k)_{k \ge 0}$. Then
$\mathcal T_n \overset{d}{\to} \hat{\mathcal T}$ as $n \to \infty$, in the topology defined in Section 6.

Furthermore, in case (i), $\mu = 1$ (the critical case) and $\hat{\mathcal T}$ is locally finite
with an infinite spine; in case (ii) $\mu = \nu < 1$ (the subcritical case) and $\hat{\mathcal T}$
has a finite spine ending with an explosion.

Remark 7.2. Note that we can combine the two cases $\nu \ge 1$ and $\nu < 1$
and define, using Lemma 3.1 and with $\Psi(\rho) = \nu$,

\[ \tau := \max\{t \le \rho : \Psi(t) \le 1\}. \tag{7.3} \]
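
For a concrete weight sequence, $\tau$ and $(\pi_k)$ can be computed numerically. The following is a minimal sketch under a simplifying assumption that is not part of the theorem: the weight sequence has finite support, so that $\Phi$ is a polynomial, $\rho = \infty$, and $\Psi$ increases to the largest $k$ with $w_k > 0$, which is at least 2. We are then in case (i), and $\tau$ solves $\Psi(\tau) = 1$. The function names and the bisection scheme are illustrative choices.

```python
def psi(w, t):
    """Psi(t) = t * Phi'(t) / Phi(t) for a finitely supported weight sequence w."""
    phi = sum(wk * t**k for k, wk in enumerate(w))
    t_dphi = sum(k * wk * t**k for k, wk in enumerate(w))   # equals t * Phi'(t)
    return t_dphi / phi

def tau_and_pi(w, iterations=200):
    """Solve Psi(tau) = 1 by bisection, then return tau and pi_k = tau^k w_k / Phi(tau).

    Assumes w[0] > 0 and w[k] > 0 for some k >= 2, as in Theorem 7.1.
    """
    lo, hi = 0.0, 1.0
    while psi(w, hi) < 1:          # Psi is increasing, so this finds an upper bracket
        hi *= 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if psi(w, mid) < 1:
            lo = mid
        else:
            hi = mid
    tau = (lo + hi) / 2
    phi = sum(wk * tau**k for k, wk in enumerate(w))
    return tau, [wk * tau**k / phi for k, wk in enumerate(w)]

# Hypothetical example: w = (1, 2, 1), i.e. Phi(t) = (1 + t)^2.
tau, pi = tau_and_pi([1, 2, 1])
mean = sum(k * p for k, p in enumerate(pi))
```

For this example $\Psi(t) = 2t/(1+t)$, so one expects $\tau = 1$, $(\pi_k) = (1/4, 1/2, 1/4)$ and mean 1, in agreement with (7.2).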



Remark 7.3. In case (ii), there is no $\tau \ge 0$ with $\Psi(\tau) = 1$, see Lemma 3.1.
Hence the definition of $\tau$ can also be expressed as follows, recalling $\Psi(t) :=
t\Phi'(t)/\Phi(t)$ from (3.6): $\tau$ is the unique number in $[0, \rho]$ such that

\[ \tau \Phi'(\tau) = \Phi(\tau), \tag{7.4} \]

if there exists any such $\tau$; otherwise $\tau := \rho$. (Equation (7.4) is used in many
papers to define $\tau$, in the case $\nu \ge 1$.)

Remark 7.4. If $0 < t < \rho$, then

\[ \frac{d}{dt}\left(\frac{\Phi(t)}{t}\right) = \frac{t\Phi'(t) - \Phi(t)}{t^2} = \frac{\Phi(t)}{t^2}\bigl(\Psi(t) - 1\bigr). \]

Since $\Psi(t)$ is increasing by Lemma 3.1, it follows that $\Phi(t)/t$ decreases on
$[0, \tau]$ and increases on $[\tau, \rho]$, so $\tau$ can, alternatively, be characterised as
the (unique) minimum point in $[0, \rho]$ of the convex function $\Phi(t)/t$, cf. e.g.
Minami [89] and Jonsson and Stefansson [67]. Consequently,

\[ \frac{\Phi(\tau)}{\tau} = \inf_{0 \le t \le \rho} \frac{\Phi(t)}{t} = \inf_{0 \le t < \infty} \frac{\Phi(t)}{t}. \tag{7.5} \]

(This holds also when $\rho = 0$, trivially, since then $\Phi(t)/t = \infty$ for every
$t > 0$.)

Remark 7.5. By Remark 7.4, $\tau$ is, equivalently, the (unique) maximum
point in $[0, \rho]$ of $t/\Phi(t)$, which by (3.13) is the inverse function of the generating
function $Z(z)$. It follows easily that

\[ \tau = Z(\rho_Z), \tag{7.6} \]

where $\rho_Z = \tau/\Phi(\tau)$ is the radius of convergence of $Z$; see also Corollary ll7J/n
Note that $0 \le \rho_Z < \infty$ and that $\rho_Z = 0 \iff \tau = 0 \iff \rho = 0$.

Otter uses (7.6) as the definition of $\tau$ (by him denoted $a$); see also
Minami [89].

Remark 7.6. When $\nu = 0$ (which is equivalent to $\rho = 0$), the limit $\hat{\mathcal T}$ is the
non-random infinite star in Example 5.1, so Theorem 7.1 gives $\mathcal T_n \overset{p}{\to} \hat{\mathcal T}$.

Remark 7.7. We consider briefly the cases excluded from Theorem 7.1.
The case when $w_0 = 0$ is completely trivial, since then $w(T) = 0$ for every
finite tree, so $\mathcal T_n$ is undefined. The same holds (for $n \ge 2$) when $w_0 > 0$ but
$w_k = 0$ for all $k \ge 1$, i.e., when $\omega = 0$.

The case when $w_0 > 0$ and $w_1 > 0$ but $w_k = 0$ for $k \ge 2$, so $\omega = 1$, is
also trivial. Then $w(T) = 0$ unless $T$ is a rooted path $P_n$ for some $n$. Thus
$Z_n = w(P_n) = w_0 w_1^{n-1}$, and (a.s.) $\mathcal T_n = P_n$, which converges as $n \to \infty$ to
the infinite path $P_\infty$. We have $\nu = 1 = \omega$, but, in contrast to Theorem 7.1,
$\tau = \infty$, with $\tau$ defined e.g. by (7.3). Further, interpreting (7.1) as a limit, we
have $\pi_k = \delta_{k1}$, so $(\pi_k)$ is the distribution concentrated at 1; thus (5.2) yields
$\hat\xi = 1$ a.s., so $\hat{\mathcal T}$ consists of an infinite spine only, i.e. $\hat{\mathcal T} = P_\infty$. Consequently,
$\mathcal T_n \overset{d}{\to} \hat{\mathcal T}$ holds in this case too.

Remark 7.8. If we replace $(w_k)$ by the equivalent weight sequence $(\tilde w_k)$
given by (4.1), then (7.3) and (4.5) show that $\tau$ is replaced by

\[ \tilde\tau := \max\{t \le \tilde\rho : \tilde\Psi(t) \le 1\} = \max\{t \le \rho/b : \Psi(bt) \le 1\} = \tau/b. \tag{7.7} \]

The corresponding probability weight sequence given by (7.1) thus is, using
(Ban,

\[ \tilde\pi_k = \frac{\tilde\tau^k \tilde w_k}{\tilde\Phi(\tilde\tau)} = \pi_k, \tag{7.8} \]

so the distribution $(\pi_k)$ is invariant and depends only on the equivalence
class of $(w_k)$.

Remark 7.9. If $\rho > 0$, then $\tau > 0$ and the distribution $(\pi_k)$ is a probability
weight sequence equivalent to $(w_k)$. There are other equivalent probability
weight sequences, see Lemma 4.1, but Theorem 7.1 and the theorems below
show that $(\pi_k)$ has a special role and therefore is a canonical choice of a
weight sequence in its equivalence class. Remark 4.3 shows that $(\pi_k)$ is the
unique probability distribution with mean 1 that is equivalent to $(w_k)$, if
any such distribution exists. If no such distribution exists but $\rho > 0$, then
$(\pi_k)$ is the probability distribution equivalent to $(w_k)$ that has the maximal
mean.

A heuristic motivation for this choice of probability weight sequence is
that when we construct $\mathcal T_n$ as a Galton-Watson tree $\mathcal T$ conditioned on
$|\mathcal T| = n$, it is better to condition on an event of not too small probability;
in the critical case this probability decreases as $n^{-3/2}$ provided $\sigma^2 < \infty$,
see [13] ($\nu > 1$) and [76, Theorem 2.3.1] ($\nu \ge 1$, $\sigma^2 < \infty$), and always
subexponentially, but in the subcritical and supercritical cases it typically
decreases exponentially fast, see Theorems 17.7 and 17.11.

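As a special case of Theorem 7.1 we have the following result for the root
degree $d_{\mathcal T_n}^+(o)$, proved in Section [lH

Theorem 7.10. Let $(w_k)_{k \ge 0}$ and $(\pi_k)_{k \ge 0}$ be as in Theorem 7.1. Then, as
$n \to \infty$,

\[ \mathbb P(d_{\mathcal T_n}^+(o) = d) \to d \pi_d, \qquad d \ge 0. \tag{7.9} \]

Consequently, regarding $d_{\mathcal T_n}^+(o)$ as a random number in $\overline{\mathbb N}_0$,

\[ d_{\mathcal T_n}^+(o) \overset{d}{\to} \hat\xi, \tag{7.10} \]

where $\hat\xi$ is a random variable in $\overline{\mathbb N}_0$ with the distribution given in (5.2).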



Note that the sum $\sum_{d=0}^{\infty} d \pi_d = \mu$ of the limiting probabilities in (7.9) may
be less than 1; in that case we do not have convergence to a proper finite
random variable, which is why we regard $d_{\mathcal T_n}^+(o)$ as a random number in $\overline{\mathbb N}_0$.
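
The limit (7.9) can be observed exactly in the simplest case $w_k = 1$, where $\mathcal T_n$ is a uniformly random ordered rooted tree and $\pi_d = 2^{-d-1}$ (see Example 9.1 below). The sketch below counts trees and forests by elementary convolutions; the function names are hypothetical, and the approach (exact counting rather than simulation) is an illustrative choice.

```python
def tree_counts(n_max):
    """c[n] = number of ordered rooted trees with n nodes, for 1 <= n <= n_max (c[0] = 0)."""
    c = [0] * (n_max + 1)
    c[1] = 1
    for n in range(2, n_max + 1):
        # Split off the first subtree of the root (j nodes); what remains is a tree
        # with n - j nodes.
        c[n] = sum(c[j] * c[n - j] for j in range(1, n))
    return c

def root_degree_probability(n, d, c=None):
    """Exact P(root degree = d) for a uniformly random ordered rooted tree with n nodes."""
    c = c or tree_counts(n)
    # forests[m] = number of ordered forests with a given number of trees and m nodes;
    # start with the empty forest and add one tree at a time.
    forests = [1] + [0] * (n - 1)
    for _ in range(d):
        forests = [sum(c[j] * forests[m - j] for j in range(1, m + 1)) for m in range(n)]
    return forests[n - 1] / c[n]

counts = tree_counts(100)
probabilities = {d: root_degree_probability(100, d, counts) for d in (1, 2, 3)}
limits = {d: d * 2 ** (-d - 1) for d in (1, 2, 3)}
```

Already for $n = 100$ the exact probabilities are within 0.01 of the limits $d\pi_d = d\,2^{-d-1}$ for $d = 1, 2, 3$.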

Theorem 7.10 describes the degree of the root. If we instead take a random
node, we obtain a different limit distribution, viz. $(\pi_k)$. We state two
versions of this; the two results are of the types called annealed and quenched
in statistical physics. In the first (annealed) version, we take a random tree
$\mathcal T_n$ and, simultaneously, a random node $v$ in it. In the second (quenched)
version we fix a random tree $\mathcal T_n$ and study the distribution of outdegrees in
it. (This yields a random probability distribution. Equivalently, we study
the outdegree of a random node conditioned on the tree $\mathcal T_n$.)



Theorem 7.11. Let $(w_k)_{k \ge 0}$ and $(\pi_k)_{k \ge 0}$ be as in Theorem 7.1.

(i) Let $v$ be a uniformly random node in $\mathcal T_n$. Then, as $n \to \infty$,

\[ \mathbb P(d_{\mathcal T_n}^+(v) = d) \to \pi_d, \qquad d \ge 0. \tag{7.11} \]

(ii) Let $N_d$ be the number of nodes in $\mathcal T_n$ of outdegree $d$. Then

\[ \frac{N_d}{n} \overset{p}{\to} \pi_d, \qquad d \ge 0. \tag{7.12} \]

The proof is given in Section 16. (When $\nu > 1$, this was proved by Otter
[o^ ]. see also Minami [89].) See Section 20.2 for further results.

Instead of considering just the outdegree of a random node, i.e., its number
of children, we may obtain a stronger result by considering the subtree
containing its children, grandchildren and so on. (This random subtree is
called a fringe subtree by Aldous [2].) We have an analogous result, also
proved in Section 16. Cf. [2], which in particular contains (i) below in the
case $\nu \ge 1$ and $\sigma^2 < \infty$; this was extended by Bennies and Kersting Ul to



SIMPLY GENERATED TREES AND RANDOM ALLOCATIONS 






the general case $\nu \ge 1$. (Note that the limit distribution, i.e. the distribution
of $\mathcal T$, is a fringe distribution in the sense of [2] only if $\mu = 1$, i.e., if and only
if $\nu \ge 1$.)

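Theorem 7.12. Let $(w_k)_{k \ge 0}$ and $(\pi_k)_{k \ge 0}$ be as in Theorem 7.1, and let $\mathcal T$
be the Galton-Watson tree with offspring distribution $(\pi_k)$. Further, if $v$ is
a node in $\mathcal T_n$, let $\mathcal T_{n;v}$ be the subtree rooted at $v$.

(i) Let $v$ be a uniformly random node in $\mathcal T_n$. Then, $\mathcal T_{n;v} \overset{d}{\to} \mathcal T$, i.e., for
any fixed tree $T$,

\[ \mathbb P(\mathcal T_{n;v} = T) \to \mathbb P(\mathcal T = T). \tag{7.13} \]

(ii) Let $T$ be an ordered rooted tree and let $N_T := |\{v : \mathcal T_{n;v} = T\}|$ be the
number of nodes in $\mathcal T_n$ such that the subtree rooted there equals $T$.
Then

\[ \frac{N_T}{n} \overset{p}{\to} \mathbb P(\mathcal T = T). \tag{7.14} \]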

Remark 7.13. Aldous [3] considers also the tree obtained by a random re-rooting
of $\mathcal T_n$, i.e., the tree obtained by declaring a uniformly random node
$v$ to be the root. Note that this re-rooted tree contains $\mathcal T_{n;v}$ as a subtree,
and that, provided $v \ne o$, there is exactly one branch from the new root
not in this subtree, viz. the branch starting with the original parent of $v$.
Aldous [2] shows, at least when $\nu \ge 1$ and $\sigma^2 < \infty$, convergence of this
randomly re-rooted tree to the random sin-tree in Remark 5.3. The limit of
the re-rooted tree is thus very similar to the limit of $\mathcal T_n$ in Theorem 7.1, but
not identical to it.

8. Three different types of weights 

Although Theorem 7.1 has only two cases, it makes sense to treat the case
$\rho = 0$ separately. We thus have the following three (mutually exclusive) cases
for the weight sequence $(w_k)$.

I. $\nu \ge 1$. Then $0 < \tau < \infty$ and $\tau \le \rho \le \infty$. The weight sequence $(w_k)$
is equivalent to $(\pi_k)$, which is a probability distribution with mean
$\mu = \Psi(\tau) = 1$ and probability generating function $\sum_{k=0}^{\infty} \pi_k z^k$ with
radius of convergence $\rho/\tau \ge 1$.
II. $0 < \nu < 1$. Then $0 < \tau = \rho < \infty$. The weight sequence $(w_k)$
is equivalent to $(\pi_k)$, which is a probability distribution with mean
$\mu = \Psi(\tau) < 1$ and probability generating function $\sum_{k=0}^{\infty} \pi_k z^k$ with
radius of convergence $\rho/\tau = 1$.
III. $\nu = 0$. Then $\tau = \rho = 0$, and $(w_k)$ is not equivalent to any probability
distribution.

If we consider the modified Galton-Watson tree in Theorem 7.1, then III
is the case discussed in Example 5.1; excluding this case, I and II are the
same as (T1) and (T2) in Section 5.
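
As an illustration of case II, consider the hypothetical weight sequence $w_k = (k+1)^{-3}$, which is not taken from the text. Here $\rho = 1$ and both $\Phi(1)$ and $\Phi'(1)$ are finite, so $\nu = \Psi(1) = \Phi'(1)/\Phi(1)$. The sketch below approximates $\nu$ by truncated sums (an assumption of the sketch; the truncation error is of order $1/\text{terms}$) and classifies the result according to the three cases above.

```python
def nu_power_law(beta, terms=100_000):
    """Approximate nu = Phi'(1)/Phi(1) for w_k = (k+1)^(-beta), beta > 2, by truncated sums."""
    phi = sum((k + 1) ** -beta for k in range(terms))
    dphi = sum(k * (k + 1) ** -beta for k in range(terms))
    return dphi / phi

def weight_type(nu):
    """Return the case I, II or III of Section 8 determined by nu."""
    if nu >= 1:
        return "I"
    if nu > 0:
        return "II"
    return "III"

nu = nu_power_law(3)
case = weight_type(nu)
```

Here $\nu$ is roughly 0.368, so this weight sequence falls in case II: $\tau = \rho = 1$, and the equivalent probability distribution $\pi_k = w_k/\Phi(1)$ has mean $\mu = \nu < 1$.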

We can reformulate the partition into three cases in more probabilistic
terms. If $\xi$ is a non-negative integer valued random variable with distribution






SVANTE JANSON 



given by $p_k = \mathbb P(\xi = k)$, $k \ge 0$, then the exponential moments of $\xi$ are
$\mathbb E R^{\xi} = \sum_{k=0}^{\infty} p_k R^k$ for $R > 1$. (Equivalently, $\mathbb E e^{r\xi}$ for $r := \log R > 0$.) We
say that $\xi$, or the distribution $(p_k)$, has some finite exponential moment if
$\mathbb E R^{\xi} < \infty$ for some $R > 1$; this is equivalent to the probability generating
function $\sum_{k=0}^{\infty} p_k z^k$ having radius of convergence strictly larger than 1.

Consider again a probability distribution $(\tilde w_k)$ equivalent to $(w_k)$, with
$\tilde w_k = t^k w_k/\Phi(t)$ for some $t \le \rho$. By Section 4, the radius of convergence of
the probability generating function of this distribution is $\rho/t$, cf. (4.4).
Hence, the distribution $(\tilde w_k)$ has some finite exponential moment if and only
if $0 < t < \rho$. The cases I-III can thus be described as follows:

I. $\nu \ge 1$. Then $(w_k)$ is equivalent to a probability distribution with
mean $\mu = 1$ (with or without some exponential moment). Moreover,
$(\pi_k)$ in (7.1) is the unique such distribution.
II. $0 < \nu < 1$. Then $(w_k)$ is equivalent to a probability distribution
with mean $\mu < 1$ and no finite exponential moment. Moreover, $(\pi_k)$
in (7.1) is the unique such distribution.
III. $\nu = 0$. Then $(w_k)$ is not equivalent to any probability distribution.

Case I may be further subdivided. From an analytic point of view, it is 
natural to split I into two subcases: 

Ia. $\nu > 1$; equivalently, $0 < \tau < \rho \le \infty$. The weight sequence $(w_k)$
is equivalent to $(\pi_k)$, which is a probability distribution with mean
$\mu = 1$ and probability generating function $\sum_{k=0}^{\infty} \pi_k z^k$ with radius
of convergence $\rho/\tau > 1$. In other words, $(w_k)$ is equivalent to a
probability distribution with mean $\mu = 1$ and some finite exponential
moment. (Then $(\pi_k)$ is the unique such distribution.) By (7.6), the
condition can also be written analytically as $Z(\rho_Z) < \rho$, a version
used e.g. in [sl]. (This case is called generic in [35] and [67].)

Ib. $\nu = 1$; then $0 < \tau = \rho < \infty$. The weight sequence $(w_k)$ is equivalent
to $(\pi_k)$, which is a probability distribution with mean 1 and probability
generating function $\sum_{k=0}^{\infty} \pi_k z^k$ with radius of convergence
$\rho/\tau = 1$. In other words, $(w_k)$ is equivalent to a probability distribution
with mean $\mu = 1$ and no finite exponential moment. (Then
$(\pi_k)$ is the unique such distribution.)

Case Ia is convenient when using analytic methods, since it says that the
point $\tau$ is strictly inside the domain of convergence of $\Phi$, which is convenient
for methods involving contour integrations in the complex plane. (See e.g.
Drmota [33] for several such results of different types.) For that reason,
many papers using such methods consider only case Ia. However, it has
repeatedly turned out, for many different problems, that results proved by
such methods often hold, by other proofs, assuming only that we are in
case I with finite variance of $(\pi_k)$. (In fact, as shown in [59], it is at least
sometimes possible to use complex analytic methods also in the case when
$\tau = \rho$ and $(\pi_k)$ has a finite second moment.) Consequently, it is often more
important to partition case I into the following two cases:

Iα. $\nu \ge 1$ and $(\pi_k)$ has variance $\sigma^2 < \infty$. In other words, $(w_k)$ is
equivalent to a probability distribution $(\pi_k)$ with mean $\mu = 1$ and
finite second moment $\sigma^2$.
Iβ. $\nu = 1$ and $(\pi_k)$ has variance $\sigma^2 = \infty$. In other words, $(w_k)$ is
equivalent to a probability distribution with mean $\mu = 1$ and infinite
variance.

Note that Ia is a subcase of Iα, since a finite exponential moment implies
that the second moment is finite.

When $\nu \ge 1$, the quantity $\sigma^2$ is another natural parameter of the weight
sequence $(w_k)$, which frequently occurs in asymptotic results, see e.g. Section 20.
(When $\nu < 1$, the natural analogue is $\infty$, see Remark 5.5.) By
Theorem 7.1 (or (4.10)), $\sigma^2 = \tau \Psi'(\tau)$, so (assuming $\nu \ge 1$), we have case
Iα when $\Psi'(\tau) < \infty$ and Iβ when $\Psi'(\tau) = \infty$. Moreover, when $\nu \ge 1$, then
$(\pi_k)$ has mean $\mu = 1$, and it follows from (4.8) that the variance $\sigma^2$ of $(\pi_k)$
also is given by the formula

\[ \sigma^2 = \Phi_\tau''(1) + \mu - \mu^2 = \Phi_\tau''(1) = \frac{\tau^2 \Phi''(\tau)}{\Phi(\tau)}. \tag{8.1} \]

Hence Iα is the case $\nu \ge 1$ and $\Phi''(\tau) < \infty$; equivalently, either $\nu > 1$ or
$\nu = 1$ and $\Phi''(\rho) < \infty$.

Remark 8.1. We have seen that except in case III, we may without loss of
generality assume that the weight sequence $(w_k)$ is a probability weight sequence. If
this distribution is critical, i.e. has mean 1, we are in case I with $\pi_k = w_k$,
so we do not have to change the weights.

If the distribution $(w_k)$ is supercritical, then $\nu > 1$ and we are in case Ia;
we can change to an equivalent critical probability weight. Hence we never
have to consider supercritical weights. (Recall that by Remark 4.3, $\nu$ is the
supremum of the means of the equivalent probability weight sequences.)

If the distribution $(w_k)$ is subcritical, we can only say that we are in case
I or II. We can often change to an equivalent critical probability weight, but
not always.



9. Examples of simply generated random trees 

One of the reasons for the interest in simply generated trees is that many
kinds of random trees occurring in various applications can be seen as simply
generated random trees and conditioned Galton-Watson trees. We give some
important examples here, see further Aldous Devroye fs^] and Drmota 

a. 

We see from Theorem 7.1 and Section 8 that any simply generated random
tree defined by a weight sequence with $\rho > 0$ can be defined by an equivalent
probability weight sequence, and then the tree is the corresponding conditioned
Galton-Watson tree. Moreover, the probability weight sequence $(\pi_k)$
defined in (7.1) is the canonical choice of offspring distribution. Recall that
$(\pi_k)$ is characterised by having mean 1, whenever this is possible (i.e., in
case I), i.e., we prefer to have critical Galton-Watson trees.

Example 9.1 (ordered trees). The simplest example is to take $w_k = 1$ for
every $k \ge 0$. Thus every tree has weight 1, and $\mathcal T_n$ is a uniformly random
ordered rooted tree with $n$ nodes. Further, $Z_n$ is the number of such trees;
thus $Z_n$ is the Catalan number $C_{n-1}$, see Remark 2.2 and (2.1). (For this
reason, these random trees are sometimes called Catalan trees.)
We have

\[ \Phi(t) = \sum_{k=0}^{\infty} t^k = \frac{1}{1-t} \tag{9.1} \]

and

\[ \Psi(t) = \frac{t\Phi'(t)}{\Phi(t)} = \frac{t}{1-t}. \tag{9.2} \]

Thus $\rho = 1$ and $\nu = \infty$ (cf. Lemma 3.1(iv)), and $\Psi(\tau) = 1$ yields $\tau = 1/2$.
Hence (7.1) yields the canonical probability weight sequence

\[ \pi_k = 2^{-k-1}, \qquad k \ge 0. \tag{9.3} \]

In other words, the uniformly random ordered rooted tree is the conditioned
Galton-Watson tree with geometric offspring distribution $\xi \sim \mathrm{Ge}(1/2)$.
(This is the geometric distribution with mean 1. Any other geometric distribution
yields an equivalent weight sequence, and thus the same conditioned
Galton-Watson tree.)

The size-biased random variable $\hat\xi$ in (5.2) has the distribution

\[ \mathbb P(\hat\xi = k) = k\pi_k = k 2^{-k-1}, \qquad k \ge 1; \tag{9.4} \]

thus $\hat\xi - 1$ has a negative binomial distribution NBin(2, 1/2). It follows that
in the infinite tree $\hat{\mathcal T}$, if $v$ is a node on the spine (for example the root) and
$d^L(v), d^R(v)$ are the numbers of children of it to the left and right of the
spine, respectively, then

\[ \mathbb P(d^L(v) = j \text{ and } d^R(v) = k) = \frac{\mathbb P(\hat\xi = j + k + 1)}{j + k + 1} = 2^{-j-k-2} = 2^{-j-1} \cdot 2^{-k-1}, \qquad j, k \ge 0; \tag{9.5} \]

thus $d^L(v)$ and $d^R(v)$ are independent and both have the same distribution
Ge(1/2) as $\xi$.

We have $\sigma^2 := \operatorname{Var} \xi = \tau\Psi'(\tau) = 2$, see Theorem 7.1 and (8.1), and
$\mathbb E \hat\xi = \sigma^2 + 1 = 3$, see (5.5).



Example 9.2 (unordered trees). We have assumed that our trees are ordered,
but it is possible to consider unordered labelled rooted trees too by
imposing a random order on the set of children of each node. Note first that
for ordered trees, the ordering of the children implicitly yields a labelling
of all nodes as in Section 6. Hence, any ordered tree with $n$ nodes can be
explicitly labelled by $1, \dots, n$ in exactly $n!$ ways, and a uniformly random
labelled ordered rooted tree is the same as a uniformly random unlabelled
ordered rooted tree with a random labelling. (For unordered trees, a uniformly
random labelled tree is different from a uniformly random unlabelled
tree; unlabelled unordered trees are not simply generated trees.)

An unordered labelled rooted tree with outdegrees $d_i$ corresponds to $\prod_i d_i!$
different ordered labelled rooted trees. If we take $w_k = 1/k!$, we give each
of these ordered trees weight $\prod_i (d_i!)^{-1}$, so their total weight is 1. Hence, the
weight sequence $(1/k!)$ yields a uniformly random unordered labelled rooted
tree.

The number of unordered labelled unrooted trees with $n$ nodes is $n^{n-2}$, see
e.g. [103, Section 5.3], a result given by Cayley and known as Cayley's
formula. (Although attributed by Cayley to Borchardt [17] and even earlier
found by Sylvester [104], see e.g. [103, p. 66].) Equivalently, the number of
unordered labelled rooted trees with $n$ nodes is $n^{n-1}$. Hence random such
trees are sometimes called Cayley trees. However, this name is also used for
regular infinite trees.
We have

\[ \Phi(t) = \sum_{k=0}^{\infty} \frac{t^k}{k!} = e^t \tag{9.6} \]

and

\[ \Psi(t) = \frac{t\Phi'(t)}{\Phi(t)} = t. \tag{9.7} \]

Thus $\rho = \nu = \infty$ and $\Psi(\tau) = 1$ yields $\tau = 1$. Hence (7.1) yields the canonical
probability weight sequence

\[ \pi_k = \frac{e^{-1}}{k!}, \qquad k \ge 0. \tag{9.8} \]

In other words, the uniformly random labelled unordered rooted tree is
the conditioned Galton-Watson tree with Poisson offspring distribution $\xi \sim
\mathrm{Po}(1)$. (Any other Poisson distribution yields an equivalent weight sequence,
and thus the same conditioned Galton-Watson tree.)

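The size-biased random variable $\hat\xi$ in (5.2) has the distribution

\[ \mathbb P(\hat\xi = k) = k\pi_k = \frac{e^{-1}}{(k-1)!}, \qquad k \ge 1; \tag{9.9} \]

thus $\hat\xi - 1$ has also the Poisson distribution Po(1), i.e., $\hat\xi - 1 \overset{d}{=} \xi$. (It is only
for a Poisson distribution that $\hat\xi - 1 \overset{d}{=} \xi$.)

We have $\sigma^2 := \operatorname{Var} \xi = \tau\Psi'(\tau) = 1$ and $\mathbb E \hat\xi = \sigma^2 + 1 = 2$, cf. (8.1) and (5.5).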

The partition function is given by

\[ Z_n(\boldsymbol\pi) = \mathbb P(|\mathcal T| = n) = \frac{n^{n-1} e^{-n}}{n!}. \tag{9.10} \]

This is a special case of the Borel distribution in (11.28) below; Borel [l^
proved a result equivalent to (9.10) for a queueing problem, see also Otter
13], Tanner [io3], Dwass, Takacs [loi], Pitman, Example 11.6 and
Theorem 14.5 below. Equivalently, using (4.3),

\[ Z_n(\mathbf w) = e^n Z_n(\boldsymbol\pi) = \frac{n^{n-1}}{n!}. \tag{9.11} \]

Recall that $Z_n$ is defined by the sum (2.5) over unlabelled ordered rooted
trees; if we sum over labelled ordered rooted trees, we obtain $n!\,Z_n$, which
by the argument above corresponds to weight 1 on each labelled unordered
rooted tree; i.e., the number of labelled unordered rooted trees is $n!\,Z_n(\mathbf w) =
n^{n-1}$. Thus (9.11) is equivalent to Cayley's formula for the number of unordered
trees given above.
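
The identity $n!\,Z_n(\mathbf w) = n^{n-1}$ can be verified directly for small $n$. The sketch below computes $Z_n$ from its definition as the sum of $\prod_v w_{d^+(v)}$ over all ordered rooted trees with $n$ nodes, by decomposing a tree into its root and the ordered forest of subtrees at the children of the root; exact rational arithmetic is used, and the function name `partition_function` is a hypothetical choice.

```python
from fractions import Fraction
from math import factorial

def partition_function(w, n_max):
    """Return Z with Z[n] = sum over ordered rooted trees T with n nodes of prod_v w(d(v))."""
    Z = [Fraction(0)] * (n_max + 1)
    for n in range(1, n_max + 1):
        total = Fraction(0)
        # forest[m] = total weight of ordered forests with d trees and m nodes (d = 0 first)
        forest = [Fraction(1)] + [Fraction(0)] * (n - 1)
        for d in range(n):                       # the root degree d is at most n - 1
            total += w(d) * forest[n - 1]
            # Append one more tree; only tree sizes below n occur, and these are known.
            forest = [sum(Z[j] * forest[m - j] for j in range(1, m + 1)) for m in range(n)]
        Z[n] = total
    return Z

Z_cayley = partition_function(lambda k: Fraction(1, factorial(k)), 7)
Z_catalan = partition_function(lambda k: Fraction(1), 6)
```

With $w_k = 1/k!$ one finds $n!\,Z_n = n^{n-1}$ for every $n$ computed, and with $w_k = 1$ the same routine returns the Catalan numbers $1, 1, 2, 5, 14, 42$ of Example 9.1.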

By (9.11), the generating function $Z(z)$ is $\sum_{n=1}^{\infty} n^{n-1} z^n/n!$, known as the
tree function; see (11.21)-(11.24) in Example 11.6.

Example 9.3 (binary trees I). The name binary tree is used in (at least)
two different, but related, meanings. The first version (Drmota [33, Section
1.2.1]), sometimes called full binary tree or strict binary tree, is an ordered
rooted tree where every node has outdegree 0 or 2. We obtain a uniformly
random full binary tree by taking the weight sequence with $w_0 = w_2 = 1$,
and $w_k = 0$ for $k \ne 0, 2$. Note that this weight sequence has span 2; this is
the standard example of a weight sequence with span > 1. As a consequence,
a full binary tree of size n exists only if n is odd. (This is easily seen directly;
see Corollary 14.6 for a general result.)
We have 

$$\Phi(t) = 1 + t^2 \tag{9.12}$$

and

$$\Psi(t) = \frac{t\Phi'(t)}{\Phi(t)} = \frac{2t^2}{1+t^2}. \tag{9.13}$$

Thus $\rho = \infty$, $\nu = 2$ (cf. Lemma 3.1(v)), and $\Psi(\tau) = 1$ yields $\tau = 1$. Hence
(7.1) yields the canonical probability weight sequence

$$\pi_k = \tfrac12, \qquad k = 0, 2. \tag{9.14}$$

In other words, the random full binary tree is the conditioned Galton-Watson
tree with offspring distribution $\xi = 2X$ where $X \sim \mathrm{Be}(1/2)$. (In
the Galton-Watson tree $\mathcal{T}$, each node thus gets either twins or no children,
each outcome with probability 1/2.)

The size-biased random variable has $\mathbb{P}(\hat\xi = 2) = 1$ by (9.14) and (5.2),
so $\hat\xi = 2$ and $\hat\xi - 1 = 1$ a.s.

We have $\sigma^2 := \operatorname{Var}\xi = 1$ and $\mathbb{E}\hat\xi = \sigma^2 + 1 = 2$, cf. (8.1) and (5.3).



Example 9.4 (binary trees II). The second version of a binary tree (Drmota
[33, Example 1.3]) is a rooted tree where every node has at most one left
child and at most one right child. Thus, each outdegree is 0, 1 or 2; if there
are two children they are ordered, and, moreover, if there is only one child,
it is marked as either left or right. (There is a one-to-one correspondence
between binary trees of this type with n nodes and the full binary trees in
Example 9.3 with 2n + 1 nodes, mapping a binary tree T to a full binary tree
T', where T' is obtained from T by adding 2 - d external nodes at every node
with outdegree d; conversely, we obtain T by deleting all leaves in T' and
keeping only the nodes that have outdegree 2 in T' (the internal nodes).)

Since there are two types of nodes with outdegree 1, we obtain the correct
count of these binary trees, and a uniformly distributed random binary tree,
by taking the weight sequence $w_0 = 1$, $w_1 = 2$, $w_2 = 1$, and $w_k = 0$ for $k \ge 3$,
i.e., $w_k = \binom{2}{k}$. Thus,

$$\Phi(t) = 1 + 2t + t^2 = (1+t)^2 \tag{9.15}$$

and

$$\Psi(t) = \frac{t\Phi'(t)}{\Phi(t)} = \frac{2t}{1+t}. \tag{9.16}$$

Thus $\rho = \infty$, $\nu = 2$, and $\Psi(\tau) = 1$ yields $\tau = 1$. Hence (7.1) yields the
canonical probability weight sequence

$$\pi_k = \frac14\binom{2}{k}, \qquad k \ge 0. \tag{9.17}$$

In other words, a uniformly random binary tree of this type is the conditioned
Galton-Watson tree with binomial offspring distribution $\xi \sim \mathrm{Bi}(2, 1/2)$.
(Any other distribution $\mathrm{Bi}(2,p)$, $0 < p < 1$, is equivalent and yields the same
conditioned Galton-Watson tree.)

The size-biased random variable $\hat\xi$ has by (5.2) $\mathbb{P}(\hat\xi = 1) = \mathbb{P}(\hat\xi = 2) = \tfrac12$;
thus $\hat\xi - 1 \sim \mathrm{Bi}(1, 1/2)$.

We have $\sigma^2 := \operatorname{Var}\xi = 1/2$ and $\mathbb{E}\hat\xi = \sigma^2 + 1 = 3/2$, cf. (8.1) and (5.3).

Example 9.5 (Motzkin trees). A Motzkin tree is an ordered rooted tree with
each outdegree $\le 2$. The difference from Example 9.4 is that there is only
one type of a single child. Thus we count such trees and obtain uniformly
random Motzkin trees by taking $w_0 = w_1 = w_2 = 1$ and $w_k = 0$, $k \ge 3$. We
have

$$\Phi(t) = 1 + t + t^2 \tag{9.18}$$

and

$$\Psi(t) = \frac{t + 2t^2}{1 + t + t^2}. \tag{9.19}$$

Thus $\rho = \infty$, $\nu = 2$, and $\Psi(\tau) = 1$ yields $\tau = 1$. Hence (7.1) yields the
canonical probability weight sequence

$$\pi_k = \tfrac13, \qquad k = 0, 1, 2. \tag{9.20}$$

In other words, a uniformly random Motzkin tree is the conditioned Galton-Watson
tree with offspring distribution $\xi$ uniform on {0, 1, 2}.

The size-biased random variable $\hat\xi$ has, by (5.2) and (9.20), the distribution
$\mathbb{P}(\hat\xi = 1) = \tfrac13$, $\mathbb{P}(\hat\xi = 2) = \tfrac23$; thus $\hat\xi - 1 \sim \mathrm{Bi}(1, 2/3)$.

We have $\sigma^2 := \operatorname{Var}\xi = 2/3$ and $\mathbb{E}\hat\xi = \sigma^2 + 1 = 5/3$, cf. (8.1) and (5.3).
Example 9.6 (d-ary trees). In a d-ary tree, each node has d positions where
a child may be attached, and there is at most one child per position. (Trees
with children attached at different positions are regarded as different trees.)
This generalises the binary trees in Example 9.4, which is the special case
d = 2.

Since k children may be attached in $\binom{d}{k}$ ways (with a given order), we
obtain a uniformly random d-ary tree by taking $w_k = \binom{d}{k}$. We have

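$$\Phi(t) = (1+t)^d \tag{9.21}$$

and

$$\Psi(t) = \frac{t\Phi'(t)}{\Phi(t)} = \frac{dt}{1+t}. \tag{9.22}$$

Thus $\rho = \infty$, $\nu = \omega = d$, and $\Psi(\tau) = 1$ yields $\tau = 1/(d-1)$. Hence (7.1)
yields the canonical probability weight sequence

$$\pi_k = \binom{d}{k}\Bigl(\frac{1}{d}\Bigr)^{k}\Bigl(\frac{d-1}{d}\Bigr)^{d-k}, \qquad k = 0, \dots, d. \tag{9.23}$$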

In other words, a uniformly random d-ary tree is the conditioned Galton-Watson
tree with binomial offspring distribution $\xi \sim \mathrm{Bi}(d, 1/d)$. (Any other
distribution $\mathrm{Bi}(d,p)$, $0 < p < 1$, is equivalent and yields the same conditioned
Galton-Watson tree.)

The size-biased random variable $\hat\xi$ has the distribution

$$\mathbb{P}(\hat\xi = k) = k\pi_k = \binom{d-1}{k-1}\Bigl(\frac{1}{d}\Bigr)^{k-1}\Bigl(\frac{d-1}{d}\Bigr)^{d-k}, \tag{9.24}$$

thus $\hat\xi - 1$ has the binomial distribution $\mathrm{Bi}(d-1, 1/d)$.

We have $\sigma^2 := \operatorname{Var}\xi = 1 - 1/d$ and $\mathbb{E}\hat\xi = \sigma^2 + 1 = 2 - 1/d$, cf. (8.1) and
(5.3).
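The computations in Examples 9.3 to 9.6 all follow the same recipe: form $\Phi$, check that $\Psi(\tau) = 1$, and normalise $w_k\tau^k$ by $\Phi(\tau)$. The sketch below carries this out in exact arithmetic for finitely supported weight sequences. The function names are illustrative only, and $\tau$ is supplied by hand rather than solved for.

```python
from fractions import Fraction
from math import comb


def psi(weights, t):
    """Psi(t) = t * Phi'(t) / Phi(t) for a finitely supported weight sequence."""
    num = sum(k * w * t**k for k, w in enumerate(weights))
    den = sum(w * t**k for k, w in enumerate(weights))
    return num / den


def canonical_weights(weights, tau):
    """pi_k = w_k * tau^k / Phi(tau), the normalisation used in (7.1)."""
    phi = sum(w * tau**k for k, w in enumerate(weights))
    return [w * tau**k / phi for k, w in enumerate(weights)]


def size_biased(pi):
    """P(xi_hat = k) = k * pi_k; a distribution when pi has mean 1."""
    return [k * p for k, p in enumerate(pi)]


def variance(pi):
    """Variance of the distribution pi on {0, 1, ..., len(pi) - 1}."""
    mean = sum(k * p for k, p in enumerate(pi))
    return sum(k * k * p for k, p in enumerate(pi)) - mean**2


if __name__ == "__main__":
    d = 3
    w = [comb(d, k) for k in range(d + 1)]   # d-ary trees, Example 9.6
    tau = Fraction(1, d - 1)
    print(psi(w, tau), canonical_weights(w, tau))
```

Using `Fraction` keeps the check exact, so the output can be compared directly with (9.20), (9.23) and the stated variances.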



Example 9.7. Let $\beta$ be a real constant and let $w_k = (k+1)^{-\beta}$. (The case
$\beta = 0$ is Example 9.1.) Then $\rho = 1$.



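If $-\infty < \beta \le 1$, then $\Phi(\rho) = \infty$, so $\nu = \infty$ by (3.10) and Lemma 3.1(iv).
If $\beta > 1$, then $\Phi(\rho) = \zeta(\beta) < \infty$ and

$$\nu = \Psi(1) = \frac{\Phi'(1)}{\Phi(1)} = \frac{\zeta(\beta-1) - \zeta(\beta)}{\zeta(\beta)} \qquad \text{if } \beta > 2, \tag{9.25}$$

while $\nu = \Psi(1) = \infty$ if $\beta \le 2$. Hence, see also Bialas and Burda [13],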

$$\nu = 1 \iff \zeta(\beta - 1) = 2\zeta(\beta) \iff \beta = \beta_0 = 2.47875\ldots \tag{9.26}$$

and $\nu > 1 \iff -\infty < \beta < \beta_0$. (It can be shown that $\nu$ is a decreasing
function of $\beta$ for $\beta > 2$.) In the case $\beta = \beta_0$, when thus $\nu = 1$, we further
have $\sigma^2 = \infty$ by (8.1), since $\Phi''(1) = \infty$ when $\beta \le 3$. This is thus case I$\beta$,
in the notation of Section 8.

In the case $\beta > \beta_0$ we thus have $0 < \nu < 1$, and $\mathcal{T}_n$ converges to a random
tree $\hat{\mathcal{T}}$ with one node of infinite degree, see Theorem 7.1 and Section 5. If
$\beta \le \beta_0$, then $\nu \ge 1$ and the limit tree $\hat{\mathcal{T}}$ is locally finite. We thus see a
phase transition at $\beta = \beta_0$ when we vary $\beta$ in this example.

Note, however, that there is nothing special with the rate of decrease
$k^{-\beta_0}$; the value of $\beta_0$ depends on the exact form of our choice of the weights
$w_k$ in this example, and reflects the values for small k rather than the
asymptotic behaviour. For example, as remarked by Bialas and Burda [13],
just changing $w_0$ would change $\beta_0$ to any desired value in $(2,\infty)$. With a
different $w_0$, $\Phi(1) = \zeta(\beta) - 1 + w_0$, and a modification of (9.25) shows that
the critical value $\beta_0$ yielding $\nu = 1$ is given by, see [13],

$$2\zeta(\beta_0) - \zeta(\beta_0 - 1) = 1 - w_0. \tag{9.27}$$

In particular, $\beta_0 > 3$ for $w_0 < 1 + \zeta(2) - 2\zeta(3) = 0.24082\ldots$; in this case,
for the critical $\beta = \beta_0$, we then have $\nu = 1$ and $\sigma^2 < \infty$, see (8.1).
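The numerical values quoted in (9.26) and below (9.27) can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: it approximates $\zeta(s)$ for real $s > 1$ by Euler-Maclaurin summation (a choice made here to avoid external libraries) and solves (9.27) by bisection, which is justified because the left-hand side is increasing in $\beta_0$ on $(2,\infty)$. The function names and the search interval are hypothetical.

```python
def zeta(s, N=50):
    """Riemann zeta for real s > 1 via Euler-Maclaurin summation.

    Sums the first N - 1 terms directly and approximates the tail from N on
    by the integral plus the first correction terms.
    """
    head = sum(k ** (-s) for k in range(1, N))
    tail = (N ** (1 - s) / (s - 1)
            + 0.5 * N ** (-s)
            + s * N ** (-s - 1) / 12
            - s * (s + 1) * (s + 2) * N ** (-s - 3) / 720)
    return head + tail


def critical_beta(w0=1.0, lo=2.0 + 1e-9, hi=10.0, iters=200):
    """Solve 2*zeta(b) - zeta(b - 1) = 1 - w0 for b in (lo, hi), cf. (9.27).

    Assumes w0 > 0 is large enough that the root lies below `hi`.
    """
    def f(b):
        return 2 * zeta(b) - zeta(b - 1) - (1 - w0)

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:      # f is increasing, so the root is to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(critical_beta(1.0))              # beta_0 for w_0 = 1
    print(1 + zeta(2) - 2 * zeta(3))       # threshold for w_0 giving beta_0 = 3
```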

See [13] for some further analytic properties. For example, if $\beta_0 < 3$ (for
example when $w_0 = 1$), then, as $\beta \nearrow \beta_0$, we have $1 - \tau \sim c(\beta_0 - \beta)^{1/(\beta_0 - 2)}$,
where $c > 0$ and the exponent can take any value > 1.

Example 9.8. Take $w_k = k!$. The generating function $\Phi(t) = \sum_{k=0}^{\infty} k!\,t^k$
has radius of convergence $\rho = 0$, so we are in case III, and there exists no
equivalent conditioned Galton-Watson tree.

Theorem 7.1 shows that $\mathcal{T}_n$ converges to an infinite star, see Remark 7.6
and Example 5.1. This means that the root degree converges in probability
to $\infty$, and that the outdegree of any fixed child converges to 0 in probability,
i.e., equals 0 w.h.p. Note, however, that we cannot draw the conclusion
that the outdegrees of all children of the root are 0 w.h.p.; Theorem 7.1
and symmetry imply that the proportion of children of the root with
outdegree > 0 tends to 0, but the number of such children may still be large.
(Theorem 7.11(ii) yields the same conclusion.)

In fact, for this particular example Wk = k\, it is shown by Janson, Jon- 
sson and Stefansson [iS], using direct calculations, that w.h.p. all subtrees 
attached to the root have size 1 or 2, and that the number of such subtrees 
of size 2 has an asymptotic Poisson distribution Po(l). (This number thus 
w.h.p. equals A^i, and hiTn), and also the number of children of the root 
with at least one child.) 

Example 9.9. If we instead take Wk = k\°' with < a < 1, then as in 
Example 19.81 p = and Tn converges to the infinite star in Example 15.11 
In this case, if (for simplicity) 1/a ^ Ni, then Ni{7'n)/n^~^°' — — >• for 
1 ^ i ^ [1/aJ, while = w.h.p. for each fixed i > [1/aJ; furthermore, 
among the subtrees attached to the root, w.h.p. there are subtrees of all 
sizes ^ [1/aJ + 1, and all possible shapes of these trees, with the number of 
each type tending to oo in probability, but no larger subtrees. See Janson, 
Jonsson and Stefansson [gl] for details. 

If we take $w_k = k!^{\alpha}$ with $\alpha > 1$, then w.h.p. $\mathcal{T}_n$ is a star with $n - 1$ leaves,
so $N_d = 0$ for $1 \le d < n - 1$.

See also the examples in Section 11.
10. Balls-in-boxes

The balls-in-boxes model is a model for random allocation of m (unlabelled)
balls in n (labelled) boxes; here $m \ge 0$ and $n \ge 1$ are given integers.
The set of possible allocations is thus

$$\mathcal{B}_{m,n} := \Bigl\{(y_1, \dots, y_n) \in \mathbb{N}_0^n : \sum_{i=1}^{n} y_i = m\Bigr\}, \tag{10.1}$$

where $y_i$ counts the number of balls in box i.

We suppose again that $\mathbf{w} = (w_k)_{k=0}^{\infty}$ is a fixed weight sequence, and we
define the weight of an allocation $\mathbf{y} = (y_1, \dots, y_n)$ as

$$w(\mathbf{y}) := \prod_{i=1}^{n} w_{y_i}. \tag{10.2}$$

Given m and n, we choose a random allocation $B_{m,n}$ with probability
proportional to its weight, i.e.,

$$\mathbb{P}(B_{m,n} = \mathbf{y}) = \frac{w(\mathbf{y})}{Z(m,n)}, \qquad \mathbf{y} \in \mathcal{B}_{m,n}, \tag{10.3}$$

where the normalizing factor $Z(m,n)$, again called the partition function, is
given by

$$Z(m,n) = Z(m,n;\mathbf{w}) := \sum_{\mathbf{y} \in \mathcal{B}_{m,n}} w(\mathbf{y}). \tag{10.4}$$

We consider only m and n such that $Z(m,n) > 0$; otherwise $B_{m,n}$ is undefined.
See further Lemma 12.3. We write $B_{m,n} = (Y_1, \dots, Y_n)$.
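For small m and n the partition function (10.4) can be evaluated directly. The sketch below, with hypothetical function names, computes $Z(m,n)$ in two ways: by summing over all allocations, and as the coefficient of $t^m$ in $\Phi(t)^n$, which is what one gets by expanding the product (10.2) inside the sum (10.4). Weights are passed as a finite list, and entries beyond the end of the list are treated as 0 (an assumption of this sketch).

```python
from fractions import Fraction
from itertools import product


def partition_function(weights, m, n):
    """Z(m, n) as the coefficient of t^m in Phi(t)^n, built one box at a time."""
    coeffs = [Fraction(1)] + [Fraction(0)] * m       # Phi(t)^0 = 1
    for _ in range(n):
        new = [Fraction(0)] * (m + 1)
        for total, c in enumerate(coeffs):
            if c == 0:
                continue
            # Put k balls in the next box, keeping the running total <= m.
            for k, w in enumerate(weights[: m - total + 1]):
                new[total + k] += c * w
        coeffs = new
    return coeffs[m]


def brute_force_Z(weights, m, n):
    """Direct evaluation of the sum (10.4) over all allocations in B_{m,n}."""
    total = Fraction(0)
    for y in product(range(m + 1), repeat=n):
        if sum(y) == m and all(yi < len(weights) for yi in y):
            term = Fraction(1)
            for yi in y:
                term *= weights[yi]
            total += term
    return total


if __name__ == "__main__":
    w = [1, 2, 1]                                    # weights of Example 9.4
    print(partition_function(w, 3, 3), brute_force_Z(w, 3, 3))
```

The coefficient-extraction version needs on the order of $n m^2$ operations, whereas the brute-force sum grows like $(m+1)^n$ and is only useful as a cross-check.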

Remark 10.1. The names balls-in-boxes and balls-in-bins are used in the 
literature for several different allocation models. We use balls-in-boxes for 
the model defined here, following e.g. Bialas, Burda and Johnston [l3|. 

Example 10.2 (probability weights). In the special case when $(w_k)$ is a
probability weight sequence, let $\xi_1, \xi_2, \dots$ be i.i.d. random variables with the
distribution $(w_k)$. Then $w(\mathbf{y}) = \mathbb{P}((\xi_1, \dots, \xi_n) = \mathbf{y})$ for any $\mathbf{y} = (y_1, \dots, y_n)$.
Hence

$$Z(m,n) = \mathbb{P}((\xi_1, \dots, \xi_n) \in \mathcal{B}_{m,n}) = \mathbb{P}(S_n = m), \tag{10.5}$$

where we define

$$S_n := \sum_{i=1}^{n} \xi_i. \tag{10.6}$$

Moreover, $B_{m,n}$ has the same distribution as $(\xi_1, \dots, \xi_n)$ conditioned on
$S_n = m$:

$$(Y_1, \dots, Y_n) \overset{d}{=} \bigl((\xi_1, \dots, \xi_n) \mid S_n = m\bigr). \tag{10.7}$$

We will use this setting (and notation) several times below. (This construction
of a random allocation $B_{m,n}$ is used by Kolchin [76] and there called
the general scheme of allocation.)
We can replace the weight sequence by an equivalent weight sequence for
the balls-in-boxes model just as we did for the random trees in Section 4.

Lemma 10.3. Suppose that we replace the weights $(w_k)$ by equivalent weights
$(\widetilde{w}_k)$ where $\widetilde{w}_k := ab^k w_k$ with $a, b > 0$ as in (4.1). Then the weight of an
allocation $\mathbf{y} = (y_1, \dots, y_n) \in \mathcal{B}_{m,n}$ is changed to

$$\widetilde{w}(\mathbf{y}) = a^n b^m w(\mathbf{y}), \tag{10.8}$$

and the partition function $Z(m,n) = Z(m,n;\mathbf{w})$ is changed to

$$\widetilde{Z}(m,n) := Z(m,n;\widetilde{\mathbf{w}}) = a^n b^m Z(m,n), \tag{10.9}$$

while the distribution of $B_{m,n}$ is invariant. Thus $B_{m,n}$ depends only on the
equivalence class of the weight sequence.

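Proof. We have, by the definition (10.2),

$$\widetilde{w}(\mathbf{y}) = \prod_{i=1}^{n} \widetilde{w}_{y_i} = \prod_{i=1}^{n} ab^{y_i} w_{y_i} = a^n b^{\sum_{i=1}^{n} y_i} \prod_{i=1}^{n} w_{y_i} = a^n b^m w(\mathbf{y}), \tag{10.10}$$

which shows (10.8), and (10.9) follows by (10.4). Consequently, for every
$\mathbf{y} \in \mathcal{B}_{m,n}$, we have $\widetilde{w}(\mathbf{y})/\widetilde{Z}(m,n) = w(\mathbf{y})/Z(m,n)$, so the probability
$\mathbb{P}(B_{m,n} = \mathbf{y})$ in (10.3) is unchanged, which completes the proof. □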
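Lemma 10.3 is easy to confirm numerically on a small instance. The following sketch enumerates $\mathcal{B}_{m,n}$, builds the distribution (10.3) for a weight sequence and for a tilted version $ab^k w_k$, and compares the two. The function names and the particular weights are chosen here only for illustration.

```python
from fractions import Fraction
from itertools import product


def allocation_distribution(weights, m, n):
    """Return (Z, {y: P(B_{m,n} = y)}) by direct enumeration of (10.3)-(10.4)."""
    weight_of = {}
    top = min(m, len(weights) - 1)
    for y in product(range(top + 1), repeat=n):
        if sum(y) != m:
            continue
        wy = Fraction(1)
        for yi in y:
            wy *= weights[yi]
        if wy != 0:
            weight_of[y] = wy
    Z = sum(weight_of.values())
    return Z, {y: wy / Z for y, wy in weight_of.items()}


def tilt(weights, a, b):
    """Equivalent weights a * b^k * w_k, as in (4.1)."""
    return [a * b**k * w for k, w in enumerate(weights)]


if __name__ == "__main__":
    w = [Fraction(1), Fraction(1), Fraction(1, 2), Fraction(1, 6)]
    a, b = Fraction(3), Fraction(2, 5)
    Z1, dist1 = allocation_distribution(w, 4, 3)
    Z2, dist2 = allocation_distribution(tilt(w, a, b), 4, 3)
    print(Z2 == a**3 * b**4 * Z1, dist1 == dist2)
```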

Our aim is to describe the asymptotic distribution of the random allocation
$B_{m,n}$ as $m, n \to \infty$; we consider the case when $m/n \to \lambda$ for some real $\lambda$,
and assume for simplicity that $0 \le \lambda < \omega = \omega(\mathbf{w})$. (Cases with $m/n \to \infty$
are interesting too in some applications, for example in Section 18.7, but
will not be considered here. See e.g. Kolchin, Sevast'yanov and Chistyakov 



771 ]. Kolchin [70] and Pavlov [9d\ for such results in special cases.) The first 



step is to note that the distribution of $B_{m,n} = (Y_1, \dots, Y_n)$ is exchangeable,
i.e., invariant under any permutation of $Y_1, \dots, Y_n$. Hence, the distribution
is completely described by the (joint) distribution of the numbers of boxes
with a certain number of balls, so it suffices to study these numbers.
For any allocation of balls $\mathbf{y} = (y_1, \dots, y_n) \in \mathbb{N}_0^n$, and $k \ge 0$, let

$$N_k(\mathbf{y}) := |\{i : y_i = k\}|, \tag{10.11}$$

the number of boxes with exactly k balls. Thus, if $\mathbf{y} \in \mathcal{B}_{m,n}$, then

$$\sum_{k=0}^{\infty} N_k(\mathbf{y}) = n \qquad\text{and}\qquad \sum_{k=0}^{\infty} k N_k(\mathbf{y}) = m. \tag{10.12}$$



We thus want to find the asymptotic distribution of the random variables
$N_k(B_{m,n})$, $k = 0, 1, \dots$. Our main result is the following, which will be
proved in Section 13 together with the other theorems in this section.

Theorem 10.4. Let $\mathbf{w} = (w_k)_{k \ge 0}$ be any weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 1$. Suppose that $n \to \infty$ and $m = m(n)$ with $m/n \to \lambda$
with $0 \le \lambda < \omega$.

(i) If $\lambda \le \nu$, let $\tau$ be the unique number in $[0,\rho]$ such that $\Psi(\tau) = \lambda$.
(ii) If $\lambda > \nu$, let $\tau := \rho$.
In both cases, $0 \le \tau < \infty$ and $0 < \Phi(\tau) < \infty$. Let

$$\pi_k := \frac{w_k \tau^k}{\Phi(\tau)}, \qquad k \ge 0. \tag{10.13}$$

Then $(\pi_k)_{k \ge 0}$ is a probability distribution, with expectation

$$\mu = \Psi(\tau) = \min(\lambda, \nu) \tag{10.14}$$

and variance $\sigma^2 = \tau\Psi'(\tau) \le \infty$. Moreover, for every $k \ge 0$,

$$N_k(B_{m,n})/n \overset{p}{\longrightarrow} \pi_k. \tag{10.15}$$

If we regard the weight sequence $\mathbf{w}$ as fixed and vary $\lambda$ (i.e., vary $m(n)$),
we see that if $0 < \nu < \infty$, there is a phase transition at $\lambda = \nu$.

Note that $\tau$ and $\pi_k$ in Theorem 7.1 are the same as in Theorem 10.4
with $\lambda = 1$. Indeed, we will later see that the random trees correspond to
$m = n - 1$ and thus $\lambda = 1$.

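Remark 10.5. The argument in Remark 7.4 extends and shows that $\tau$ is
the (unique) minimum point in $[0,\rho]$ of $\Phi(t)/t^{\lambda}$; i.e.,

$$\frac{\Phi(\tau)}{\tau^{\lambda}} = \inf_{0 \le t \le \rho} \frac{\Phi(t)}{t^{\lambda}} = \inf_{0 \le t < \infty} \frac{\Phi(t)}{t^{\lambda}}. \tag{10.16}$$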

By (10.15), there are roughly $n\pi_k$ boxes with k balls. Summing this
approximation over all k we would get n boxes (as we should) with a total
of $n\sum_{k=0}^{\infty} k\pi_k = n\mu$ balls. However, the total number of balls is $m \sim n\lambda$,
so in the case $\lambda > \nu$, (10.14) shows that about $n(\lambda - \mu) = n(\lambda - \nu)$ balls are
missing. Where are they?

The explanation is that the sums $\sum_{k=0}^{\infty} k N_k(B_{m,n})/n = m/n$ are not
uniformly summable, and we cannot take the limit inside the summation sign.
The "missing balls" appear in one or several boxes with very many balls, but
these "giant" boxes are not seen in the limit (10.15) for fixed k. In physical
terminology, this can be regarded as condensation of part of the mass (=
balls). We study this further in Section 18.6. The simplest case is that there
is a single giant box with $\approx (\lambda - \nu)n$ balls. We shall see that this happens
in an important case (Theorem 18.33, see also Bialas, Burda and Johnston
[14, Fig. 1] for some numerical examples), but that there are also other
possibilities (Examples 18.36-18.38).
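To make the "missing balls" concrete, one can compute $\tau$ and $\mu = \Psi(\tau)$ from Theorem 10.4 for the power-law weights $w_k = (k+1)^{-\beta}$ of Example 9.7, where $\rho = 1$ and $\nu < \infty$ for $\beta > 2$. The sketch below is an illustration only: it truncates the series for $\Phi$ and $\Psi$ at a finite number of terms (so $\nu$ is slightly underestimated), and the function names, the truncation level and the bisection depth are choices made here.

```python
def phi_psi(t, beta, K=5000):
    """Truncated Phi(t) and Psi(t) = t*Phi'(t)/Phi(t) for w_k = (k+1)^(-beta).

    Intended for 0 <= t <= 1 (the radius of convergence is rho = 1).
    """
    phi = 0.0
    t_phi_prime = 0.0
    t_pow = 1.0                       # t^k, updated incrementally
    for k in range(K + 1):
        term = (k + 1) ** (-beta) * t_pow
        phi += term
        t_phi_prime += k * term
        t_pow *= t
    return phi, t_phi_prime / phi


def limit_parameters(lam, beta, iters=50):
    """Return (tau, mu) as in Theorem 10.4 for these weights."""
    _, nu = phi_psi(1.0, beta)
    if lam >= nu:                     # case (ii), or lam == nu: tau = rho = 1
        return 1.0, nu
    lo, hi = 0.0, 1.0                 # case (i): solve Psi(tau) = lam
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if phi_psi(mid, beta)[1] < lam:
            lo = mid
        else:
            hi = mid
    tau = 0.5 * (lo + hi)
    return tau, phi_psi(tau, beta)[1]


if __name__ == "__main__":
    for lam in (0.2, 1.0):
        tau, mu = limit_parameters(lam, beta=3.0)
        print(f"lambda={lam}: tau={tau:.6f}, mu={mu:.6f}, missing={lam - mu:.6f}")
```

For $\beta = 3$ and $\lambda = 1$ the computed $\mu$ stays near $\zeta(2)/\zeta(3) - 1$, so a positive fraction $\lambda - \nu$ of the balls is not accounted for by boxes of bounded size.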

Recall that for simply generated random trees, which as said above correspond
to balls-in-boxes with $\lambda = 1$, Theorem 7.1 too shows that there is
a condensation when $\nu < \lambda = 1$ (since then $\mu < 1$ by (7.2)): in this case
the condensation appears as a node of infinite degree in the random limit
tree $\hat{\mathcal{T}}$ of type (T2), see Section 5. We shall in Section 19 study the relation
between the forms of the condensation shown in Theorems 7.1 and 10.4.

We further have the following, essentially equivalent, version of Theorem 10.4,
where we assume only that m/n is bounded, but not necessarily
convergent.
Theorem 10.6. Let $\mathbf{w} = (w_k)_{k \ge 0}$ be any weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 1$. Suppose that $n \to \infty$ and $m = m(n)$ with $m/n \le C$
for some $C < \omega$.

Define the function $\tau : [0,\infty) \to [0,\infty]$ by $\tau(x) := \sup\{t \le \rho : \Psi(t) \le x\}$.
Then $\tau(x)$ is the unique number in $[0,\rho]$ such that $\Psi(\tau(x)) = x$ when $x \le \nu$,
and $\tau(x) = \rho$ when $x \ge \nu$; furthermore, the function $x \mapsto \tau(x)$ is continuous.
We have $0 \le \tau(m/n) < \infty$ and $0 < \Phi(\tau(m/n)) < \infty$, and for every $k \ge 0$,

$$\frac{N_k(B_{m,n})}{n} - \frac{w_k\,\tau(m/n)^k}{\Phi(\tau(m/n))} \overset{p}{\longrightarrow} 0. \tag{10.17}$$

Furthermore, for any $C < \omega$, this holds uniformly as $n \to \infty$ for all $m = m(n)$
with $m/n \le C$.

Returning to the random variables $Y_1, \dots, Y_n$, we have the following result,
which is shown by a physicists' proof by Bialas, Burda and Johnston [14].



Theorem 10.7. Let $\mathbf{w} = (w_k)_{k \ge 0}$ be any weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 1$. Suppose that $n \to \infty$ and $m = m(n)$ with $m/n \to \lambda$
where $0 \le \lambda < \omega$, and let $(\pi_k)_{k \ge 0}$ be as in Theorem 10.4. Then, for every
$\ell \ge 1$ and $y_1, \dots, y_\ell \ge 0$,

$$\mathbb{P}(Y_1 = y_1, \dots, Y_\ell = y_\ell) \to \prod_{i=1}^{\ell} \pi_{y_i}. \tag{10.18}$$

In other words, for every fixed $\ell$, the random variables $Y_1, \dots, Y_\ell$ converge
jointly to independent random variables with the distribution $(\pi_k)_{k \ge 0}$.

A fancier way of describing the same result is that the sequence
$Y_1, \dots, Y_n$, arbitrarily extended to infinite length, converges in distribution,
as an element of $\mathbb{N}_0^{\infty}$, to a sequence of i.i.d. random variables with the
distribution $(\pi_k)_{k \ge 0}$. (See e.g. [15, Problem 3.7].)

Remark 10.8. We have assumed $w_0 > 0$ in the results above for convenience,
and because this condition is necessary when discussing simply
generated trees, which is our main topic. The balls-in-boxes model makes
sense also when $w_0 = 0$, but this case is easily reduced to the case $w_0 > 0$:
Let $\alpha := \min\{k : w_k > 0\}$. If $\alpha > 0$, then this means that each box
has to have at least $\alpha$ balls. (In particular, we need $m \ge \alpha n$.) There
is an obvious correspondence between such allocations in $\mathcal{B}_{m,n}$ and allocations
in $\mathcal{B}_{m-\alpha n,n}$ obtained by removing $\alpha$ balls from each box. Formally,
if $\mathbf{y} = (y_1, \dots, y_n) \in \mathcal{B}_{m,n}$ let $\widetilde{\mathbf{y}} = (\widetilde{y}_1, \dots, \widetilde{y}_n)$ with $\widetilde{y}_i := y_i - \alpha$, and note
that if we shift the weight sequence to $\widetilde{w}_k := w_{k+\alpha}$, then $\widetilde{w}(\widetilde{\mathbf{y}}) = w(\mathbf{y})$; thus
$B_{m,n}$ has the same distribution as $B_{m-\alpha n,n}$ for $\widetilde{\mathbf{w}}$, with $\alpha$ extra balls added
in each box. It follows easily that the results above hold also in the case
$w_0 = 0$. (We interpret $w_k\tau^k/\Phi(\tau)$ for $\tau = 0$ as the appropriate limit value.
Note also that it is essential to use (j3.2p and not (jS.Sp when wq = 0.) 
Remark 10.9. Similarly, we can always reduce to the case span(w) = 1: If
span(w) = d, then the number of balls in each box has to be a multiple of d,
so we may instead consider an allocation of m/d "superballs", each consisting
of d balls. This means replacing each $Y_i$ by $Y_i/d$ and using the weight
sequence $(w_{dk})$. We prefer, however, to allow a general span in our theorems,
for ease of our applications to simply generated trees where the corresponding
reduction is more complicated. (For trees, we may replace each branch
by a d-fold branch. In the probability weight sequence case with Galton-Watson
trees, this replaces the random variable $\xi$ by $(\xi_1 + \dots + \xi_d)/d$, with
$\xi_i \overset{d}{=} \xi$ i.i.d., but the root gets a different offspring distribution $\xi/d$; more
generally, for a general weight sequence $\mathbf{w}$, we replace $\Phi(t)$ by $\Phi(t^{1/d})^d$, except
at the root where we use different weights with the generating function
$\Phi(t^{1/d})$. We will not use this and leave the details to the reader.)

Remark 10.10. We have assumed $m/n \to \lambda < \omega$ in Theorems 10.4 and 10.7,
and similarly $m/n \le C < \omega$ in Theorem 10.6; hence, for n large at least,
$m/n < \omega$. In fact, $m/n \le \omega$ is trivially necessary, see Lemma 12.3. When
$\omega < \infty$, the only remaining case (assuming m/n converges) is thus $m/n \to \omega$
with $m/n \le \omega$; in this case, it is easy to see that (10.15) and (10.18) hold
with $\pi_\omega = 1$ and $\pi_k = 0$, $k \ne \omega$. (This can be seen as a limiting case of
(10.13) with $\tau = \infty$.)

In fact, if $\omega < \infty$, so the boxes have a finite maximum capacity $\omega$, then
the complementation $y_i \mapsto \omega - y_i$ yields a bijection of $\mathcal{B}_{m,n}$ onto $\mathcal{B}_{\omega n-m,n}$,
which preserves weights if $(w_k)$ simultaneously is reflected to $\widetilde{\mathbf{w}} := (w_{\omega-k})$.
Hence, $B_{m,n}$ corresponds to $B_{\omega n-m,n}$ (for $\widetilde{\mathbf{w}}$), and results for $m/n \to \omega < \infty$
follow from results for $m/n \to 0$.

As said above, we do not consider the case $\omega = \infty$ and $m/n \to \infty$, when
the average occupancy tends to infinity.



11. Examples of balls-in-boxes 

Apart from the connection with simply generated trees, see Section 14,
the balls-in-boxes model is interesting in its own right.

We begin with three classic examples of balls-in-boxes, see e.g. iFellerl [H 



II.5] and Kolchin [76], followed by further examples from probability theory,
combinatorics and statistical physics, including several examples of random
forests. (We return to these examples of random forests in Section 18.7,
where we study the size of the largest tree in them.)

Example 11.1 (Maxwell-Boltzmann statistics; multinomial distribution). 
Consider a uniform random allocation of m labelled balls in n boxes. This 
is the same as throwing m balls into n boxes at random, independently 
and with each ball uniformly distributed. (In statistical mechanics, this 
is known as the Maxwell-Boltzmann statistics.) It is elementary that the 
resulting random allocation $(Y_1, \dots, Y_n)$ has a multinomial distribution

$$\mathbb{P}\bigl((Y_1, \dots, Y_n) = (y_1, \dots, y_n)\bigr) = n^{-m}\binom{m}{y_1, \dots, y_n} = m!\,n^{-m}\prod_{i=1}^{n}\frac{1}{y_i!}. \tag{11.1}$$

If we take $w_k = 1/k!$, we see that the probabilities in (11.1) and (10.3)
are proportional, and thus must be identical, so the weight sequence $(1/k!)$
yields the uniform random allocation of labelled balls. We see also that then

$$Z(m,n) = n^m/m!. \tag{11.2}$$

Alternatively, we may take a Poisson distribution Po(a): $w_k = a^k e^{-a}/k!$;
this is an equivalent weight sequence for any $a > 0$. We see directly that
then $S_n \sim \mathrm{Po}(na)$, so (10.5) yields

$$Z(m,n) = (na)^m e^{-na}/m!; \tag{11.3}$$

hence we see again that (10.3) and (11.1) agree.
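As a sanity check of (11.2), the sketch below evaluates $Z(m,n)$ for $w_k = 1/k!$ by summing (10.4) recursively over the content of the first box, and compares the marginal $\mathbb{P}(Y_1 = k) = w_k Z(m-k,n-1)/Z(m,n)$ (obtained from (10.3) by fixing the first coordinate) with the binomial distribution $\mathrm{Bi}(m, 1/n)$ of a single box under independent uniform throws. The function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def Z_mb(m, n):
    """Z(m, n) for w_k = 1/k!, summing (10.4) over the content of one box."""
    if n == 0:
        return Fraction(1 if m == 0 else 0)
    return sum(Fraction(1, factorial(k)) * Z_mb(m - k, n - 1)
               for k in range(m + 1))


def marginal(k, m, n):
    """P(Y_1 = k) = w_k * Z(m - k, n - 1) / Z(m, n)."""
    return Fraction(1, factorial(k)) * Z_mb(m - k, n - 1) / Z_mb(m, n)


def binomial_pmf(k, m, n):
    """P(Bi(m, 1/n) = k): the law of one box under independent uniform throws."""
    return comb(m, k) * Fraction(1, n) ** k * Fraction(n - 1, n) ** (m - k)


if __name__ == "__main__":
    m, n = 5, 4
    print(Z_mb(m, n), Fraction(n**m, factorial(m)))
    print([marginal(k, m, n) == binomial_pmf(k, m, n) for k in range(m + 1)])
```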

Comparing with Example 9.2 and using Lemma 16.1 below, we see that
the multiset of degrees in a random unordered labelled tree of size n has
exactly the distribution obtained when throwing $n - 1$ balls into n boxes at
random.

With $w_k = 1/k!$ we have, as in Example 9.2, (9.6)-(9.7) and $\rho = \omega = \nu = \infty$.
Hence, if $m/n \to \lambda$, we have $\tau = \lambda$ and thus $\pi_k = \lambda^k e^{-\lambda}/k!$, so
$(\pi_k)$ is the Po($\lambda$) distribution, which thus is the canonical choice of weights.
(In the asymptotic case; for given m and n one might choose Po(m/n), cf.
(10.17).)

Theorem 10.7 (or (10.15)) shows that if $m/n \to \lambda < \infty$, then the asymptotic
distribution of the number of balls in a given urn is Po($\lambda$).

The idea to study the multinomial distribution as a vector of i.i.d. Poisson 
variables conditioned on the sum is an old one that has been used repeatedly. 



see e.g. Kolchin, Sevast'yanov and Chistyakov [73|, Hoist [52, l53|, Kolchin 
fzi, Janson il. 



Example 11.2 (Bose-Einstein statistics). The weight sequence $w_k = 1$
yields a uniform distribution over all allocations of m identical and
indistinguishable balls in n boxes; thus each allocation $(y_1, \dots, y_n) \in \mathcal{B}_{m,n}$ has the
same probability $1/|\mathcal{B}_{m,n}| = 1/\binom{n+m-1}{m}$.

This is known as Bose-Einstein statistics in statistical quantum mechanics;
it is the distribution followed by bosons. (In the simple case with no
forces acting on them.)

Comparing with Example 9.1 and using Lemma 16.1 below, we see that
the multiset of degrees in a random ordered tree of size n has exactly the
distribution obtained by a uniform random allocation of $n - 1$ balls into n
boxes.






As in Example 9.1, we have (9.1)-(9.2) and $\rho = 1$, $\nu = \infty$. If $m/n \to \lambda < \infty$,
then the equation $\Psi(\tau) = \lambda$ is, by (9.2), $\tau/(1 - \tau) = \lambda$, and thus

$$\tau = \frac{\lambda}{1 + \lambda}. \tag{11.4}$$

Any geometric distribution Ge(p) with $0 < p < 1$ is a weight sequence
equivalent to $(w_k)$, and (11.4) shows that the canonical choice (7.1) is, using
(9.1),

$$\pi_k = (1 - \tau)\tau^k = \frac{\lambda^k}{(1 + \lambda)^{k+1}}, \qquad k \ge 0, \tag{11.5}$$

which is the distribution $\mathrm{Ge}(1 - \tau) = \mathrm{Ge}(1/(\lambda + 1))$. By Theorem 10.7, this
is also the asymptotic distribution of balls in a given urn.
See also Holst [52, 53] and Kolchin [76].
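The geometric limit can be compared with the exact finite-n marginal. Since $w_k = 1$, fixing $Y_1 = k$ leaves $m - k$ balls to be placed freely in the other $n - 1$ boxes, so $\mathbb{P}(Y_1 = k)$ is a ratio of two binomial coefficients. The sketch below, with hypothetical function names, evaluates both sides.

```python
from math import comb


def bose_einstein_marginal(k, m, n):
    """Exact P(Y_1 = k) for the uniform allocation of m balls in n >= 2 boxes.

    Uses |B_{m,n}| = C(m + n - 1, n - 1), applied to the remaining n - 1 boxes
    in the numerator.
    """
    if k > m:
        return 0.0
    return comb(m - k + n - 2, n - 2) / comb(m + n - 1, n - 1)


def geometric_limit(k, lam):
    """pi_k = (1 - tau) * tau^k with tau = lam / (1 + lam), cf. (11.4)."""
    tau = lam / (1 + lam)
    return (1 - tau) * tau ** k


if __name__ == "__main__":
    n, m = 2000, 4000
    for k in range(5):
        print(k, bose_einstein_marginal(k, m, n), geometric_limit(k, m / n))
```

Python's integer division handles the very large binomial coefficients exactly before rounding to a float, so no special care is needed for moderate n.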

Example 11.3 (Fermi-Dirac statistics). The other type of particles in
statistical quantum mechanics is fermions; they exclude each other (the Pauli
exclusion principle) so all allocations of them have to satisfy $Y_i \le 1$, i.e.,
$Y_i \in \{0, 1\}$. A random allocation uniform among all such possibilities is
known as Fermi-Dirac statistics; this is thus equivalent to a uniform random
choice of one of the $\binom{n}{m}$ subsets of m boxes.

We obtain this distribution by the choice $w_0 = w_1 = 1$ and $w_k = 0$ for
$k \ge 2$; thus

$$\Phi(t) = 1 + t \tag{11.6}$$

and

$$\Psi(t) = \frac{t}{1 + t}. \tag{11.7}$$

We have $\rho = \infty$ and $\nu = \omega = 1$. (Formally, (11.6) is the case d = 1 of (9.21),
but note that we assume $d \ge 2$ in Example 9.6.)

If $m/n \to \lambda < 1$, we thus have a rather trivial example of the general
theory with $\tau/(1 + \tau) = \lambda$ and thus

$$\tau = \frac{\lambda}{1 - \lambda}, \tag{11.8}$$

and $(\pi_k) = (1 - \lambda, \lambda, 0, 0, \dots)$, i.e., the Bernoulli distribution Be($\lambda$). (Any
Bernoulli distribution Be(p) with $0 < p < 1$ is equivalent.)

Since $\omega = 1$, the corresponding conditioned Galton-Watson tree is trivially
the deterministic path, a case which we have excluded above.



Example 11.4 (Pólya urn [53]). Consider a multicolour Pólya urn containing
balls of n different colours, see Eggenberger and Pólya [37]. Initially,
the urn contains $a > 0$ balls of each colour. Balls are drawn at random,
one at a time. After each drawing, the drawn ball is replaced together with
$b > 0$ additional balls of the same colour. (It is natural to take a and b to be
integers, but the model is easily interpreted also for arbitrary real $a, b > 0$,
see e.g. [58].)

Make m draws, and let $Y_i$ be the number of times that a ball of colour i
is drawn; then $(Y_1, \dots, Y_n)$ is a random allocation in $\mathcal{B}_{m,n}$.
A straightforward calculation, see [33], [66], [53], shows that

$$\mathbb{P}\bigl((Y_1, \dots, Y_n) = (y_1, \dots, y_n)\bigr) = \binom{m}{y_1, \dots, y_n}\frac{\prod_{i=1}^{n} a(a+b)\cdots(a + (y_i - 1)b)}{na(na+b)\cdots(na + (m-1)b)} = \frac{\prod_{i=1}^{n}\binom{a/b + y_i - 1}{y_i}}{\binom{na/b + m - 1}{m}}. \tag{11.9}$$

Hence, as noted by Holst [53], this equals the random allocation given by
the weights

$$w_k = \binom{a/b + k - 1}{k} = (-1)^k\binom{-a/b}{k}, \qquad k = 0, 1, \dots. \tag{11.10}$$
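The identity between the urn probabilities (11.9) and the balls-in-boxes probabilities with weights (11.10) can be verified exactly on a small instance. The sketch below is illustrative: function names are hypothetical, the generalized binomial coefficient is computed as a rising product divided by $k!$, and $Z(m,n)$ is obtained by summing over $\mathcal{B}_{m,n}$ rather than from a closed form.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def rising(x, step, count):
    """x * (x + step) * (x + 2*step) * ... with `count` factors."""
    out = Fraction(1)
    for j in range(count):
        out *= x + j * step
    return out


def allocations(m, n):
    """All allocations of m balls in n boxes, i.e. the set B_{m,n}."""
    return [y for y in product(range(m + 1), repeat=n) if sum(y) == m]


def urn_probability(y, a, b):
    """First expression in (11.9): probability of the draw counts y."""
    n, m = len(y), sum(y)
    prob = Fraction(factorial(m))
    for yi in y:
        prob *= rising(Fraction(a), Fraction(b), yi) / factorial(yi)
    return prob / rising(Fraction(n * a), Fraction(b), m)


def weight(k, a, b):
    """w_k = C(a/b + k - 1, k) from (11.10), for real a/b > 0."""
    return rising(Fraction(a) / Fraction(b), Fraction(1), k) / factorial(k)


def allocation_probability(y, a, b):
    """w(y) / Z(m, n) as in (10.3), with Z summed over B_{m,n}."""
    def w_of(z):
        out = Fraction(1)
        for zi in z:
            out *= weight(zi, a, b)
        return out
    Z = sum(w_of(z) for z in allocations(sum(y), len(y)))
    return w_of(y) / Z


if __name__ == "__main__":
    a, b = Fraction(3, 2), 2
    print(all(urn_probability(y, a, b) == allocation_probability(y, a, b)
              for y in allocations(4, 3)))
```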

Note that the case a = b yields $w_k = 1$ and the uniform random allocation
in Example 11.2 (Bose-Einstein statistics). We have

$$\Phi(t) = \sum_{k=0}^{\infty}\binom{a/b + k - 1}{k}t^k = (1 - t)^{-a/b} \tag{11.11}$$

with radius of convergence $\rho = 1$, and thus

$$\Psi(t) = \frac{a}{b}\cdot\frac{t}{1 - t}. \tag{11.12}$$

Hence, $\nu = \Psi(1) = \infty$, and for any $\lambda \in [0, \infty)$,

$$\tau = \frac{b\lambda}{a + b\lambda}. \tag{11.13}$$

The equivalent probability weight sequences are, by Lemma 4.1, given by

$$\frac{w_k t^k}{\Phi(t)} = \binom{a/b + k - 1}{k}t^k(1 - t)^{a/b}, \qquad 0 < t < 1, \tag{11.14}$$

which is the negative binomial distribution $\mathrm{NBin}(a/b, 1 - t)$ (where the
parameter a/b is not necessarily an integer). The canonical choice, which by
Theorems 10.4 and 10.7 is the asymptotic distribution of the number of balls
of a given colour, is $\mathrm{NBin}(a/b, 1 - \tau) = \mathrm{NBin}(a/b, a/(a + b\lambda))$. See also Holst
[53] and Kolchin [76].

Note that the case b = 0 (excluded above) means drawing with replacement;
this is Example 11.1, which thus can be seen as a limit case. (This
corresponds to the Poisson limit $\mathrm{NBin}(a/b, a/(a + b\lambda)) \overset{d}{\longrightarrow} \mathrm{Po}(\lambda)$ as $b \to 0$.)

Example 11.5 (drawing without replacement). Consider again an urn with
balls of n colours, with initially a balls of each colour. (This time, $a \ge 1$ is
an integer.) Draw m balls without replacement, and let as above $Y_i$ be the
number of drawn balls of colour i. (The case a = 1 yields the Fermi-Dirac
statistics in Example 11.3.)

Formally, this is the case b = -1 of Example 11.4, and a similar calculation
shows that

$$\mathbb{P}\bigl((Y_1, \dots, Y_n) = (y_1, \dots, y_n)\bigr) = \frac{\prod_{i=1}^{n}\binom{a}{y_i}}{\binom{na}{m}}; \tag{11.15}$$

hence this is the random allocation given by the weights

$$w_k = \binom{a}{k}, \qquad k = 0, 1, \dots. \tag{11.16}$$

We have thus $\Phi(t) = (1 + t)^a$, exactly as in Example 9.6 with d = a.

The equivalent probability weight sequences are the binomial distributions
$\mathrm{Bi}(a, p)$, $0 < p < 1$, and the canonical choice is, for $0 < \lambda < a$,
$(\pi_k) = \mathrm{Bi}(a, \lambda/a)$, i.e.

$$\pi_k = \binom{a}{k}\Bigl(\frac{\lambda}{a}\Bigr)^{k}\Bigl(\frac{a - \lambda}{a}\Bigr)^{a-k} = \binom{a}{k}\frac{\lambda^k(a - \lambda)^{a-k}}{a^a}. \tag{11.17}$$

See also Holst [53] and Kolchin [76].

Note that taking the limit as $a \to \infty$, we obtain drawing with replacement,
which is Example 11.1; this corresponds to the Poisson limit
$\mathrm{Bi}(a, \lambda/a) \overset{d}{\longrightarrow} \mathrm{Po}(\lambda)$ as $a \to \infty$.



Example 11.6 (random rooted forests [7a]). Consider labelled rooted forests 
consisting of n unordered rooted trees with together m labelled nodes. (Thus
$m \ge n$.) We may assume that the n roots are labelled $1, \dots, n$; let $T_i$ be the
tree with root i and let $t_i := |T_i|$. Then the node sets $V(T_i)$ form a partition
of $\{1, \dots, m\}$, so $\sum_{i=1}^{n} t_i = m$ and $(t_1, \dots, t_n)$ is an allocation in $\mathcal{B}_{m,n}$, with
each $t_i \ge 1$. Furthermore, given $(t_1, \dots, t_n) \in \mathcal{B}_{m,n}$ with all $t_i \ge 1$, the node
sets $V(T_i)$ can be chosen in $\binom{m-n}{t_1 - 1, \dots, t_n - 1}$ ways, and given $V(T_i)$, the tree
$T_i$ can by Cayley's formula be chosen in $t_i^{t_i - 2}$ ways. (The trees are rooted
but the roots are given.) Hence, the number of forests with the allocation
$(t_1, \dots, t_n)$ is

$$\binom{m-n}{t_1 - 1, \dots, t_n - 1}\prod_{i=1}^{n} t_i^{t_i - 2} = (m-n)!\prod_{i=1}^{n}\frac{t_i^{t_i - 2}}{(t_i - 1)!} = (m-n)!\prod_{i=1}^{n}\frac{t_i^{t_i - 1}}{t_i!}. \tag{11.18}$$

Hence, a uniformly random labelled rooted forest corresponds to a random
allocation $B_{m,n}$ with the weight sequence $w_k = k^{k-1}/k!$, $k \ge 1$, and $w_0 = 0$.
Note that here $w_0 = 0$, unlike almost everywhere else in the present paper; in
the notation of Remark 10.8, we have $\alpha = 1$. (As discussed in Remark 10.8,
we can reduce to the case $w_0 > 0$ by considering $(t_1 - 1, \dots, t_n - 1)$, which
is an allocation in $\mathcal{B}_{m-n,n}$; this means that we count only non-root nodes.
We prefer, however, to keep the setting above with $w_0 = 0$, noting that the
results above still hold by Remark 10.8.)

If $F_{m,n}$ denotes the number of labelled rooted forests with m labelled
nodes of which n are given as roots, then (11.18) implies

$$F_{m,n} = (m-n)!\,Z(m,n). \tag{11.19}$$
It is well-known that $F_{m,n} = n m^{m-n-1}$, a formula also given by Cayley ^] ,
see e.g. [103, Proposition 5.3.2] or [99]; thus

$$Z(m,n) = \frac{n m^{m-n-1}}{(m-n)!}. \tag{11.20}$$
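The closed form (11.20) can be checked against the definition of $Z(m,n)$ for small m and n. The sketch below, with hypothetical function names, extracts the coefficient of $t^m$ in $\Phi(t)^n$ for $w_k = k^{k-1}/k!$ ($k \ge 1$, $w_0 = 0$) in exact arithmetic and then applies (11.19).

```python
from fractions import Fraction
from math import factorial


def forest_Z(m, n):
    """Z(m, n) for w_k = k^(k-1)/k!, k >= 1, and w_0 = 0."""
    w = [Fraction(0)] + [Fraction(k ** (k - 1), factorial(k))
                         for k in range(1, m + 1)]
    coeffs = [Fraction(1)] + [Fraction(0)] * m       # Phi(t)^0 = 1
    for _ in range(n):
        new = [Fraction(0)] * (m + 1)
        for total, c in enumerate(coeffs):
            if c:
                # Each tree has at least one node, so k starts at 1.
                for k in range(1, m - total + 1):
                    new[total + k] += c * w[k]
        coeffs = new
    return coeffs[m]


def rooted_forest_count(m, n):
    """F_{m,n} = (m - n)! * Z(m, n), as in (11.19)."""
    return factorial(m - n) * forest_Z(m, n)


if __name__ == "__main__":
    for m in range(1, 7):
        print(m, [rooted_forest_count(m, n) for n in range(1, m + 1)])
```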

We have

$$\Phi(t) = \sum_{k=1}^{\infty}\frac{k^{k-1}}{k!}t^k = T(t), \tag{11.21}$$

the well-known tree function (known by this name since it is the exponential
generating function for rooted unordered labelled trees, cf. Example 9.2).
Note that $T(z)$ satisfies the functional equation

$$T(z) = z e^{T(z)}; \tag{11.22}$$

see e.g. [iO, Section II. 5]. Equivalently, 

$$z = T(z)e^{-T(z)}, \tag{11.23}$$

which by differentiation leads to

$$T'(z) = \frac{T(z)}{z(1 - T(z))}. \tag{11.24}$$

Hence,

$$\Psi(t) := \frac{t\Phi'(t)}{\Phi(t)} = \frac{1}{1 - T(t)}. \tag{11.25}$$

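By (11.21) and Stirling's formula, $\Phi(t)$ has radius of convergence $\rho = e^{-1}$.
Furthermore, (11.23) implies that $\Phi(\rho) = T(e^{-1}) = 1$. Hence, (11.25) yields
$\nu = \Psi(\rho) = \infty$, and if $1 \le \lambda < \infty$, then $\lambda = \Psi(\tau)$ is solved by

$$T(\tau) = 1 - \frac{1}{\lambda} = \frac{\lambda - 1}{\lambda} \tag{11.26}$$

and thus, using (11.23),

$$\tau = \frac{\lambda - 1}{\lambda}e^{-(\lambda - 1)/\lambda}. \tag{11.27}$$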
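The relations (11.22) to (11.27) lend themselves to a quick numerical check. The sketch below is a floating-point illustration with hypothetical function names: it evaluates $T$ by iterating the functional equation (which converges for $0 \le z < e^{-1}$), confirms that the $\tau$ of (11.27) solves $\Psi(\tau) = \lambda$, and evaluates the Borel-type probabilities $k^{k-1}x^{k-1}e^{-kx}/k!$ with $x = T(\tau)$ on a logarithmic scale to avoid overflow.

```python
from math import exp, lgamma, log


def tree_function(z, iters=200):
    """T(z) for 0 <= z < 1/e, by iterating T <- z * exp(T), cf. (11.22)."""
    T = 0.0
    for _ in range(iters):
        T = z * exp(T)
    return T


def tau_for(lam):
    """tau from (11.27), for lam > 1."""
    x = (lam - 1) / lam
    return x * exp(-x)


def psi(t):
    """Psi(t) = 1 / (1 - T(t)), cf. (11.25)."""
    return 1.0 / (1.0 - tree_function(t))


def borel_pmf(k, x):
    """k^(k-1) x^(k-1) e^(-k x) / k! for k >= 1 and 0 < x <= 1."""
    return exp((k - 1) * log(k * x) - k * x - lgamma(k + 1))


if __name__ == "__main__":
    lam = 3.0
    x = tree_function(tau_for(lam))
    print(psi(tau_for(lam)), x, sum(borel_pmf(k, x) for k in range(1, 2001)))
```

With $\lambda = 3$ one gets $x = 2/3$, and the probabilities sum to 1 with mean $1/(1 - x) = \lambda$, as expected for the limit in Theorem 10.4.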

The probability weight sequences equivalent to $(w_k)$ are by Lemma 4.1
given by, substituting $x = T(t)$, and thus $t = xe^{-x}$ by (11.23),

$$\frac{w_k t^k}{\Phi(t)} = \frac{k^{k-1}t^k}{k!\,T(t)} = \frac{k^{k-1}x^{k-1}e^{-kx}}{k!}, \qquad k \ge 1, \tag{11.28}$$

where $0 < t \le e^{-1}$ and thus $0 < x \le 1$. This is known as a Borel distribution;
it appears for example as the distribution of the size $|\mathcal{T}|$ of the Galton-Watson
tree with offspring distribution Po(x). (This was first proved by



Borel jl8j ]. It follows by Theorem 114.51 below^with the probability weigh t 
sequence Po(x); see also Otter 93|, Tanner [IOTI ]. Dwass [s^, Takacs lOd ]. 
Pitman [99*].) It follows that the random rooted forest considered here has 
the same distribution as the forest defined by a Galton-Watson process
starting with n individuals (the roots) and Po(x) offspring distribution,
conditioned to have total size m; cf. Example 11.8 below. See further Kolchin



7g and Pavlov [9Q]. 



In particular, the canonical distribution for a given $\lambda \ge 1$ is, using (11.27),

$$\pi_k = \frac{k^{k-1}\tau^k}{T(\tau)\,k!} = \frac{k^{k-1}}{k!}\Bigl(\frac{\lambda - 1}{\lambda}\Bigr)^{k-1}e^{-k(\lambda - 1)/\lambda}. \tag{11.29}$$

By Theorems 10.4 and 10.7 and Remark 10.8, this is the asymptotic
distribution of the size of a given (or random) tree in the forest, say $T_1$. The
asymptotic distribution of $|T_1|$ is thus the distribution of the size $|\mathcal{T}|$ of a
Galton-Watson tree with offspring distribution $\mathrm{Po}(1 - 1/\lambda)$. Moreover, $T_1$
is, given its size $|T_1|$, uniformly distributed over all trees on $|T_1|$ nodes, and
the same is true for the Poisson Galton-Watson tree $\mathcal{T}$ by Example 9.2.

Consequently, $T_1 \overset{d}{\longrightarrow} \mathcal{T}$ as $n \to \infty$ with $m/n \to \lambda$. (We may regard $T_1$ as
an ordered tree, ordering the children of a node e.g. by their labels.)

The same random allocation $B_{m,n}$ also describes the block lengths in
hashing with linear probing; see Janson. Indeed, there is a one-to-one
correspondence between hash tables and rooted forests, see e.g. Knuth
(Exercise 6.4-31) and Chassaing and Louchard [24].



Example 11.7 (random unrooted forests). Consider labelled unrooted forests
consisting of $n$ trees with together $m$ labelled nodes. (Thus $m \ge n$.)
We may assume that the $n$ trees are labelled $T_1, \dots, T_n$; let $t_i := |T_i|$. As
in Example 11.6, the node sets $V(T_i)$ form a partition of $\{1, \dots, m\}$, so
$\sum_{i=1}^n t_i = m$ and $(t_1, \dots, t_n)$ is an allocation in $\mathcal B_{m,n}$, with each $t_i \ge 1$. In
the unrooted case, given $(t_1, \dots, t_n) \in \mathcal B_{m,n}$ with all $t_i \ge 1$, the node sets
$V(T_i)$ can be chosen in $\binom{m}{t_1, \dots, t_n}$ ways, and given $V(T_i)$, the tree $T_i$ can by
Cayley's formula be chosen in $t_i^{t_i - 2}$ ways. Hence, the number of unrooted
forests with the allocation $(t_1, \dots, t_n)$ is

$$\binom{m}{t_1, \dots, t_n} \prod_{i=1}^n t_i^{t_i-2} = m! \prod_{i=1}^n \frac{t_i^{t_i-2}}{t_i!}. \tag{11.30}$$

Hence, a uniformly random labelled unrooted forest corresponds to a random
allocation $B_{m,n}$ with the weight sequence $w_k = k^{k-2}/k!$, $k \ge 1$, and $w_0 = 0$.
As in Example 11.6, we have $w_0 = 0$, but this is no problem by Remark 10.8.
If $F^*_{m,n}$ denotes the number of labelled unrooted forests with $m$ labelled
nodes and $n$ labelled trees, then (11.30) implies

$$F^*_{m,n} = m!\, Z(m,n). \tag{11.31}$$

There is no simple general formula for $F^*_{m,n}$, as there is for the rooted forests
in Example 11.6, and hence no simple formula for $Z(m,n)$. Asymptotics are
given by Britikov [20]. (See Example 17.16 for one case. The asymptotic
formula when $m/n \to \lambda > 2$ follows similarly from Theorem 18.33(ii), and
when $m/n \to \lambda < 2$ with $m = \lambda n + o(\sqrt n)$ from Theorem 17.12.)
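
Although there is no simple closed formula, (11.31) is easy to evaluate for small $m$ and $n$. The following Python sketch computes the partition function $Z(m,n)$ by repeated convolution in exact rational arithmetic and multiplies by $m!$. The function names `partition_function`, `unrooted_weight` and `count_unrooted_forests` are illustrative assumptions; the convolution is just the definition of $Z(m,n)$ as a sum over allocations, not an algorithm taken from the text.

```python
from fractions import Fraction
from math import factorial


def partition_function(m, n, weight):
    """Z(m, n): sum over (y_1, ..., y_n) with y_1 + ... + y_n = m of prod weight(y_i)."""
    w = [weight(k) for k in range(m + 1)]
    # z[j] holds Z(j, i) after i boxes have been processed; Z(j, 0) = 1 iff j == 0.
    z = [Fraction(1)] + [Fraction(0)] * m
    for _ in range(n):
        z = [sum(w[k] * z[j - k] for k in range(j + 1)) for j in range(m + 1)]
    return z[m]


def unrooted_weight(k):
    """w_k = k^(k-2) / k! for k >= 1 and w_0 = 0, as in Example 11.7."""
    if k == 0:
        return Fraction(0)
    return Fraction(k) ** (k - 2) / factorial(k)


def count_unrooted_forests(m, n):
    """Number of labelled unrooted forests, m! * Z(m, n), with n labelled trees."""
    value = factorial(m) * partition_function(m, n, unrooted_weight)
    assert value.denominator == 1
    return value.numerator


if __name__ == "__main__":
    print([count_unrooted_forests(5, n) for n in range(1, 6)])
```

For $n = 1$ this reduces to Cayley's formula $m^{m-2}$, which gives a simple sanity check.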
We have

$$\Phi(t) := \sum_{k=1}^\infty \frac{k^{k-2}}{k!}\, t^k = T(t) - \tfrac12 T(t)^2, \tag{11.32}$$

where $T(t)$ is the tree function in (11.21). (The latter equality is well-known,
see e.g. [40, II.5.3]; it can be shown e.g. by showing that both sides have the
same derivative $T(t)/t$; there are also combinatorial proofs.) Hence, since
$t\Phi'(t) = T(t)$,

$$\Psi(t) := \frac{t\Phi'(t)}{\Phi(t)} = \frac{T(t)}{T(t) - \frac12 T(t)^2} = \frac{1}{1 - T(t)/2}, \tag{11.33}$$

cf. the similar (11.25) in the rooted case.

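As for (11.21), $\Phi$ has the radius of convergence $\rho = e^{-1}$, but now, by
(11.33), $\nu = \Psi(\rho) = 2$ is finite, so there is a phase transition at $\lambda = 2$.
The parameter $\tau$ is by the definition in Theorem 7.1 and (11.33) given by
$T(\tau) = 2 - 2/\lambda = 2(\lambda - 1)/\lambda$ for $\lambda \le 2$; thus, using (11.23),

$$\tau = \begin{cases} \dfrac{2(\lambda-1)}{\lambda}\, e^{-2(\lambda-1)/\lambda}, & \lambda \le 2, \\[1ex] e^{-1}, & \lambda \ge 2. \end{cases} \tag{11.34}$$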

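The probability weight sequences equivalent to $(w_k)$ are by Lemma 4.1
given by, again substituting $t = xe^{-x}$ or $x = T(t)$,

$$\tilde w_k = \frac{k^{k-2} t^k}{T(t)(1 - T(t)/2)\, k!} = \frac{x (kx)^{k-2} e^{-kx}}{(1 - x/2)\, k!}, \qquad k \ge 1, \tag{11.35}$$

where $0 < t \le e^{-1}$ and thus $0 < x \le 1$. In particular, the canonical
distribution for a given $\lambda \ge 1$ is, by (11.34) and (11.35), for $k \ge 1$,

$$\pi_k = \frac{k^{k-2}\tau^k}{T(\tau)(1 - T(\tau)/2)\, k!} = \begin{cases} \dfrac{\lambda k^{k-2}}{k!} \Bigl(\dfrac{2(\lambda-1)}{\lambda}\Bigr)^{k-1} e^{-2k(\lambda-1)/\lambda}, & \lambda \le 2, \\[1.5ex] \dfrac{2 k^{k-2}}{k!}\, e^{-k}, & \lambda \ge 2. \end{cases} \tag{11.36}$$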

By Theorems 10.4 and 10.7 and Remark 10.8, this is the asymptotic distribution
of the size of a given (or random) tree in the forest, say $T_1$.

We shall see in Theorem 18.48 that the phase transition at $\lambda = 2$ is seen
clearly in the size of the largest tree in the forest: if $m/n \to \lambda < 2$, then the
largest tree is of size $O_p(\log n)$, while if $m/n \to \lambda > 2$, then there is a unique
giant tree of size $(\lambda - 2)n + o_p(n)$; for details see Theorems 18.33 and 18.48
and, more generally, Łuczak and Pittel [83]. This is thus an example of the
condensation discussed after Theorem 10.4 (and similar to the condensation
in Theorem 7.1 when $\nu < 1$).

Example 11.8 (simply generated forests and Galton-Watson forests). A
simply generated forest is a sequence $(T_1, \dots, T_n)$ of rooted trees, with weight

$$w(T_1, \dots, T_n) := \prod_{i=1}^n w(T_i), \tag{11.37}$$

where $w(T_i)$ is given by (2.3), for some fixed weight sequence $\mathbf w$. A simply
generated random forest with $n$ trees and $m$ nodes, where $n$ and $m$ are given
with $m \ge n$, is such a forest chosen at random, with probability proportional
to its weight. Note that in the special case $n = 1$, this is the same as a
simply generated random tree defined in Section 2. More generally, for any
$n$, a simply generated random forest $(T_1, \dots, T_n)$ is, conditioned on the sizes
$|T_1|, \dots, |T_n|$, a sequence of independent simply generated random trees with
the given sizes (all defined by the same weight sequence $\mathbf w$). Moreover, the
sizes $(|T_1|, \dots, |T_n|)$ form an allocation in $\mathcal B_{m,n}$, and it is easily seen that this
is a random allocation $B_{m,n}$ defined by the weight sequence $(Z_k)_{k=0}^\infty$, where
$Z_k$ is the partition function (2.5) for simply generated trees with weight
sequence $\mathbf w$ (and $Z_0 = 0$).

A simply generated random forest can thus be obtained by a two-stage
process, combining the constructions in Sections 2 and 10. Note that
equivalent weight sequences $\mathbf w$ yield equivalent weight sequences $(Z_k)$ by (4.3),
and thus the same simply generated random forest.

In the special case when $\mathbf w$ is a probability weight sequence, we also
define a Galton-Watson forest with $n$ trees, for a given $n$, as a sequence
$(\mathcal T_1, \dots, \mathcal T_n)$ of $n$ i.i.d. Galton-Watson trees; it describes the evolution of a
Galton-Watson process started with $n$ particles. (It can also be seen as a
single Galton-Watson tree $\mathcal T$ with the root chopped off, conditioned on the
root degree being $n$, provided that this root degree is possible.) Note that
the probability distribution of the forest is given by the weights in (11.37).
Hence, in the probability weight sequence case, the simply generated random
forest equals the conditioned Galton-Watson forest with $n$ trees and $m$
nodes, defined as a Galton-Watson forest with $n$ trees conditioned on the
total size being $m$; in other words, it describes a Galton-Watson process
started with $n$ particles conditioned on the total size being $m$.

Random forests of this type are studied by Pavlov [96], see also Flajolet
and Sedgewick [40, Example III.21].

For example, taking $w_k = 1/k!$, we have by (9.11) $Z_k = k^{k-1}/k!$, $k \ge 1$;
this is the weight sequence used in Example 11.6, so we obtain the same
random allocation of tree sizes as there; moreover, given the tree sizes, the
trees are uniformly random labelled unordered rooted trees by Example 9.2.
Consequently, for this weight sequence, the simply generated random forest
is the random labelled forest with unordered rooted trees in Example 11.6.
The same random forest is obtained by the equivalent probability weight
sequence $\tilde w_k = x^k e^{-x}/k!$, with $0 < x \le 1$, so it equals also the conditioned
Galton-Watson forest with offspring distribution Po($x$), cf. Example 11.6.

Another example is obtained by taking $w_k = 1$ for all $k \ge 0$. Then
every forest has weight 1, so this simply generated random forest is a
uniformly random forest of ordered rooted trees. (An ordered rooted forest.)
By Example 9.1, the weight sequence $(Z_k)$ is then given by the Catalan
numbers, $Z_k = C_{k-1} = (2k-2)!/(k!\,(k-1)!)$, $k \ge 1$.
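
The two weight sequences $(Z_k)$ just mentioned can be checked by machine. The Python sketch below computes the coefficients of the generating function $Z(z)$ of the tree partition function from the standard relation $Z(z) = z\Phi(Z(z))$ for simply generated trees, by fixed-point iteration on truncated power series. The relation is stated here as an assumption about the definitions in Section 2, and the names `poly_mul` and `tree_partition_function` are illustrative only.

```python
from fractions import Fraction
from math import factorial


def poly_mul(a, b, n):
    """Product of two coefficient lists, truncated after degree n."""
    c = [0] * (n + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > n:
                break
            c[i + j] += ai * bj
    return c


def tree_partition_function(weight, kmax):
    """Coefficients Z_0, ..., Z_kmax of the solution of Z(z) = z * Phi(Z(z))."""
    z = [0] * (kmax + 1)
    for _ in range(kmax):  # each pass fixes one more coefficient
        phi = [0] * (kmax + 1)
        power = [1] + [0] * kmax  # current power Z(z)^d, starting with d = 0
        for d in range(kmax + 1):
            wd = weight(d)
            for i in range(kmax + 1):
                phi[i] += wd * power[i]
            power = poly_mul(power, z, kmax)
        z = [0] + phi[:kmax]  # multiply by the variable and truncate
    return z


if __name__ == "__main__":
    print(tree_partition_function(lambda d: 1, 8)[1:])
    print(tree_partition_function(lambda d: Fraction(1, factorial(d)), 5)[1:])
```

With $w_k = 1$ the output should reproduce the Catalan numbers $C_{k-1}$, and with $w_k = 1/k!$ the values $k^{k-1}/k!$ used in Example 11.6.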
Further examples are given by starting with the other examples of random
trees in Section 9.

We shall see in Theorem 17.11 that if the weight sequence $\mathbf w$ is as in
Theorem 7.1 and further span($\mathbf w$) $= 1$, $\nu \ge 1$ and $\sigma^2 < \infty$, then

$$Z_k \sim \frac{\tau}{\sqrt{2\pi\sigma^2}}\, k^{-3/2} \Bigl(\frac{\Phi(\tau)}{\tau}\Bigr)^k. \tag{11.38}$$

Recalling $Z(\tau/\Phi(\tau)) = \tau$ by (7.6), we may replace $Z_k$ by the equivalent
probability weight sequence

$$\tilde Z_k := \frac{Z_k}{Z(\tau/\Phi(\tau))} \Bigl(\frac{\tau}{\Phi(\tau)}\Bigr)^k = \frac{Z_k}{\tau} \Bigl(\frac{\tau}{\Phi(\tau)}\Bigr)^k \sim \frac{1}{\sqrt{2\pi\sigma^2}}\, k^{-3/2}, \tag{11.39}$$

so we have the asymptotic behaviour $\tilde Z_k \sim c k^{-3/2}$ for every such weight
sequence $\mathbf w$, where only the constant $c = 1/\sqrt{2\pi\sigma^2}$ depends on $\mathbf w$. This
explains why random forests of this type have similar asymptotic behaviour,
in contrast to the unrooted forests in Example 11.7.

Example 11.9. Let, as in Example 9.7, $w_k = (k+1)^{-\beta}$ for some real
constant $\beta$. Then $\rho = 1$. As shown in Example 9.7, $\nu = \infty$ if $\beta \le 2$, and
$\nu < \infty$ if $\beta > 2$; in the latter case, $\nu$ is given by (9.25). This example is
studied further in e.g. Bialas, Burda and Johnston [14].

Example 11.10 (power-law). More generally, suppose that $w_k \sim c k^{-\beta}$ as
$k \to \infty$, for some real constant $\beta$ and $c > 0$, i.e., that $w_k$ asymptotically
satisfies a power-law. Qualitatively, we have the same behaviour as in
Examples 9.7 and 11.9, but numerical values such as the critical $\beta$ in (9.26)
will in general be different.

We repeat some easy facts: first, $\rho = 1$, $\omega = \infty$ and span($\mathbf w$) $= 1$.

If $-\infty < \beta \le 1$, then $\Phi(\rho) = \Phi(1) = \infty$; hence $\nu = \infty$ by Lemma 3.1(iv).
If $1 < \beta \le 2$, then $\Phi(\rho) < \infty$ but $\Phi'(\rho) = \sum_{k=0}^\infty k w_k = \infty$; hence again
$\nu = \Psi(\rho) = \infty$ by (3.11).

On the other hand, if $\beta > 2$, then $\Phi(1) < \infty$ and $\Phi'(1) < \infty$, and thus
$\nu < \infty$ by (3.11). Summarising:

$$\nu < \infty \iff \beta > 2. \tag{11.40}$$

In the case $\beta > 2$, there is thus a phase transition when we vary $\lambda$.

Suppose $\beta > 2$, so $\nu < \infty$. If $\lambda \ge \nu$, then $\tau = \rho = 1$, and the canonical
distribution $(\pi_k)$ is by (10.13) given simply by $\pi_k = w_k/\Phi(1)$. This
distribution then has mean $\mu = \nu < \infty$ by (10.14); since $\pi_k \asymp k^{-\beta}$ as $k \to \infty$, the
variance $\sigma^2 = \infty$ if $2 < \beta \le 3$, while $\sigma^2 < \infty$ when $\beta > 3$.

Note that Examples 11.6 and 11.7 with random forests are of this type,
provided we replace $w_k$ by the equivalent $\tilde w_k := e^{-k} w_k$; Stirling's formula
shows that $\tilde w_k \sim c k^{-\beta}$ where $\beta = 3/2$ for rooted forests and $\beta = 5/2$ for
unrooted forests (and $c = 1/\sqrt{2\pi}$). The different values of $\beta$ explain the
different asymptotic behaviours of these two types of random forests: by
the results above, the tail behaviour of $\tilde w_k$ implies that $\nu = \infty$ for rooted
forests but $\nu < \infty$ for unrooted forests, as we have shown by explicit
calculations in Examples 11.6 and 11.7. Recall that this means that there is
a phase transition and condensation for high $m/n$ in the unrooted case but
not in the rooted case.

More generally, (11.39) shows that simply generated random forests under
weak assumptions have the same power-law behaviour of the weight
sequence with $\beta = 3/2$ as the special case of (unordered) rooted forests in
Example 11.6. Thus $\nu = \infty$ and there is no phase transition. (At least not
in the range $m = O(n)$ that we consider. Pavlov shows a phase transition
at $m = \Theta(n^2)$.)


Example 11.11 (unlabelled forests). Consider, as Pavlov [93], rooted forests
consisting of $n$ rooted unlabelled trees, assuming that the trees, or
equivalently the roots, are labelled $1, \dots, n$, but otherwise the nodes are unlabelled.
A uniformly random forest of this type with $m$ nodes can be seen as
balls-in-boxes with the weight sequence $(t_k)$, where $t_k$ is the number of unlabelled
rooted trees with $k$ nodes. In this case there is no simple formula for the
generating function $\Phi(z)$, but there is a functional equation, from which it
can be shown that $t_k \sim c_1 k^{-3/2} \rho^{-k}$, where $\rho \approx 0.3382$ as usual is the radius
of convergence of $\Phi(z)$ and $c_1 \approx 0.4399$, see Otter or, e.g., Drmota [33,
Section 3.1.5]. Furthermore, $\Phi(\rho) = 1$; thus $(t_k \rho^k)$ gives an equivalent
probability weight sequence with $t_k \rho^k \sim c_1 k^{-3/2}$ as $k \to \infty$. The asymptotic
behaviour of the weight sequence is thus the same as for labelled rooted
forests in Example 11.6, and more generally for Galton-Watson forests (under
weak conditions) in Example 11.8, and we expect the same type of
asymptotic behaviour in spite of the fact that the unlabelled forest is not
simply generated; this is seen in detail in Pavlov for the size of the
largest tree. In particular, we have $\nu = \infty$ by Example 11.10 and (11.40),
and thus there is no phase transition at finite $\lambda$.


Similarly, Bernikovich and Pavlov [12] considered unrooted forests
consisting of $n$ trees labelled $1, \dots, n$ with a total of $m$ unlabelled nodes. These
are described by the weight sequence $(\hat t_k)$ where $\hat t_k$ is the number of
unrooted unlabelled trees with $k$ nodes. Again, there is no simple formula
for the generating function $\hat\Phi(z) := \sum_k \hat t_k z^k$, but there is the relation
$\hat\Phi(z) = \Phi(z) - \frac12 \Phi(z)^2 + \frac12 \Phi(z^2)$ found by Otter, which leads to the
asymptotic formula $\hat t_k \sim c_2 k^{-5/2} \rho^{-k}$, where $\rho$ is as above and $c_2 \approx 0.5347$, see
also Drmota [33, Section 3.1.5]. In this case, $(\hat t_k \rho^k / \hat\Phi(\rho))$ gives an equivalent
probability weight sequence which is $\sim (c_2/\hat\Phi(\rho)) k^{-5/2}$ as $k \to \infty$, which is
the same type of asymptotic behaviour as for the weight sequence for
labelled unrooted forests in Example 11.7; we thus expect the same type of
asymptotic behaviour as for those forests. In particular, $\nu < \infty$ by
Example 11.10; a numerical calculation gives $\nu := \rho \hat\Phi'(\rho)/\hat\Phi(\rho) \approx 2.0513$, see
Bernikovich and Pavlov [12].

Note that both types of "unlabelled" forests considered here have the trees
labelled (i.e., ordered). Completely unlabelled forests cannot be described
by balls-in-boxes (as far as we know), since the number of (non-isomorphic)
ways to number the trees depends on the forest.

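Example 11.12 (the backgammon model). The model with $w_k = 1/k!$ for
$k \ge 1$ as in Example 11.1, but $w_0 > 0$ arbitrary, was considered by Ritort
[100] and Franz and Ritort [41], who called it the backgammon model.
We have

$$\Phi(t) = w_0 + \sum_{k=1}^\infty \frac{t^k}{k!} = e^t + w_0 - 1 \tag{11.41}$$

and

$$\Psi(t) := \frac{t\Phi'(t)}{\Phi(t)} = \frac{t}{1 + (w_0 - 1)e^{-t}}. \tag{11.42}$$

Thus $\rho = \nu = \infty$. The equation $\Psi(\tau) = \lambda$ can be written

$$(\tau - \lambda)e^{\tau} = (w_0 - 1)\lambda, \tag{11.43}$$

and the solution can be written

$$\tau = \lambda + W\bigl((w_0 - 1)\lambda e^{-\lambda}\bigr) = \lambda - T\bigl((1 - w_0)\lambda e^{-\lambda}\bigr), \tag{11.44}$$

where $W(z)$ is the Lambert $W$ function [26] defined by $W(z)e^{W(z)} = z$
and $T(z)$ is the tree function in (11.21) (analytically extended to all real
$z < e^{-1}$); note that $W(z) = -T(-z)$ by the functional equation for $T$, see [26].

The canonical probability weight sequence (10.13) is, using (11.42) and
$\Psi(\tau) = \lambda$,

$$\pi_k = \frac{\tau^k}{k!\, \Phi(\tau)} = \frac{\lambda \tau^{k-1} e^{-\tau}}{k!}, \qquad k \ge 1, \tag{11.45}$$

and $\pi_0 = \lambda \tau^{-1} e^{-\tau} w_0$.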
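
The equation $\Psi(\tau) = \lambda$ of the backgammon model is also easy to solve numerically without the Lambert $W$ function. The following Python sketch uses plain bisection on (11.42), which is increasing in $t$, and then builds the canonical distribution; it is meant as a check of (11.43) and of the normalisation and mean of the canonical weights, not as the method of the cited papers. The names `psi`, `solve_tau` and `canonical_distribution`, and the truncation `kmax`, are illustrative assumptions.

```python
import math


def psi(t: float, w0: float) -> float:
    """Psi(t) = t / (1 + (w0 - 1) e^{-t}) for the backgammon weights, w0 > 0."""
    return t / (1.0 + (w0 - 1.0) * math.exp(-t))


def solve_tau(lam: float, w0: float) -> float:
    """Solve Psi(tau) = lam for lam > 0 by bisection (Psi is increasing)."""
    lo, hi = 0.0, 1.0
    while psi(hi, w0) < lam:  # bracket the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if psi(mid, w0) < lam:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def canonical_distribution(lam: float, w0: float, kmax: int = 100):
    """Return [pi_0, ..., pi_kmax] for the canonical weights at parameter lam."""
    tau = solve_tau(lam, w0)
    pi0 = lam * w0 * math.exp(-tau) / tau
    rest = [lam * tau ** (k - 1) * math.exp(-tau) / math.factorial(k)
            for k in range(1, kmax + 1)]
    return [pi0] + rest


if __name__ == "__main__":
    lam, w0 = 2.0, 0.5
    tau = solve_tau(lam, w0)
    print(tau, (tau - lam) * math.exp(tau), (w0 - 1.0) * lam)
```

For $w_0 = 1$ the weights are those of a Poisson distribution and the solver should simply return $\tau = \lambda$.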

Example 11.13 (random permutations and recursive forests). Consider
permutations of $\{1, \dots, m\}$ with exactly $n$ cycles. Let us label the cycles
$1, \dots, n$, in arbitrary order, and let $y_i$ be the length of the $i$:th cycle. Then
$(y_1, \dots, y_n)$ is an allocation in $\mathcal B_{m,n}$ with each $y_i \ge 1$, and for each such
$(y_1, \dots, y_n) \in \mathcal B_{m,n}$, the number of permutations with $y_i$ elements in cycle $i$
is

$$\binom{m}{y_1, \dots, y_n} \prod_{i=1}^n (y_i - 1)! = m! \prod_{i=1}^n \frac{1}{y_i}, \tag{11.46}$$

since there are $(y - 1)!$ cycles with $y$ given elements. Consequently, a
uniformly random permutation of $\{1, \dots, m\}$ with exactly $n$ cycles corresponds
to a random allocation $B_{m,n}$ defined by the weights $w_0 = 0$ and $w_k = 1/k$
for $k \ge 1$. Note that here, as in Example 11.6, $w_0 = 0$, and Remark 10.8
applies with $\alpha = 1$.

The number of permutations with $n$ (unlabelled) cycles is by (11.46)

$$m!\, Z(m,n)/n!, \tag{11.47}$$

where we divide by $n!$ in order to ignore the labelling above.
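
The count in (11.47) is the unsigned Stirling number of the first kind, which gives an independent check of the balls-in-boxes description. The Python sketch below computes $m!\,Z(m,n)/n!$ with the weights $w_k = 1/k$ in exact arithmetic and compares it with the classical recurrence $c(m,n) = c(m-1,n-1) + (m-1)\,c(m-1,n)$. The recurrence is a standard fact not stated in the text, and the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial


def partition_function(m, n, weight):
    """Z(m, n): sum over (y_1, ..., y_n) with sum m of prod weight(y_i)."""
    w = [weight(k) for k in range(m + 1)]
    z = [Fraction(1)] + [Fraction(0)] * m  # Z(j, 0)
    for _ in range(n):
        z = [sum(w[k] * z[j - k] for k in range(j + 1)) for j in range(m + 1)]
    return z[m]


def permutations_with_cycles(m, n):
    """m! Z(m, n) / n! with w_0 = 0 and w_k = 1/k for k >= 1."""
    weight = lambda k: Fraction(1, k) if k >= 1 else Fraction(0)
    value = factorial(m) * partition_function(m, n, weight) / factorial(n)
    assert value.denominator == 1
    return value.numerator


def stirling_first_kind(m, n):
    """Unsigned Stirling number c(m, n) from the classical recurrence."""
    c = [[0] * (n + 1) for _ in range(m + 1)]
    c[0][0] = 1
    for i in range(1, m + 1):
        for j in range(1, min(i, n) + 1):
            c[i][j] = c[i - 1][j - 1] + (i - 1) * c[i - 1][j]
    return c[m][n]


if __name__ == "__main__":
    print([permutations_with_cycles(5, n) for n in range(1, 6)])
```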
The same balls-in-boxes model with $w_k = 1/k$, $k \ge 1$, also describes
random recursive forests, see Pavlov and Loseva [98].
We have

$$\Phi(t) = \sum_{k=1}^\infty \frac{t^k}{k} = -\log(1 - t) \tag{11.48}$$

with radius of convergence $\rho = 1$ and

$$\Psi(t) := \frac{t\Phi'(t)}{\Phi(t)} = \frac{t}{-(1-t)\log(1-t)}, \tag{11.49}$$

so $\nu = \Psi(1) = \infty$, cf. Example 11.10 ($\beta = 1$).

The equivalent probability weight sequences are by Lemma 4.1 given by

$$\tilde w_k = \frac{x^k}{k\,|\ln(1 - x)|}, \qquad 0 < x < 1, \tag{11.50}$$

with probability generating function $\Phi(xz)/\Phi(x) = \log(1 - xz)/\log(1 - x)$.
This distribution is called the logarithmic distribution. See further Kolchin,
Sevast'yanov and Chistyakov [77] and Kolchin [76].

By Remark 10.8, we obtain results on random permutations of $\{1, \dots, m\}$
with $n$ cycles as $m/n \to \lambda \in [1, \infty)$, see for example Kazimirov. However, it is
of greater interest to consider random permutations without constraining
the number of cycles. This can be done using methods similar to the ones
used here, but is outside the scope of the present paper; see e.g. Kolchin,
Sevast'yanov and Chistyakov [77], Kolchin [76] and Arratia, Barbour and
Tavaré [3]. Note that even if we condition on the number of cycles, a typical
random permutation of $\{1, \dots, m\}$ has about $\log m$ cycles, so we are
interested in the case $n \sim \log m$ and thus $m/n \to \infty$, which we do not consider
here.

Other random objects that can be decomposed into components can be 
studied similarly, for example random mappings [zl] ; our results apply only 
to random objects with a given number of components (in some cases), but 
similar methods are useful for the general case; see Kolchin [76] and Arratia,
Barbour and Tavaré [3].



12. Preliminaries 



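Proof of Lemma 3.1. (i) Since $\Phi'(t) = \sum_{k=0}^\infty k w_k t^{k-1}$ has the same radius
of convergence $\rho$ as $\Phi$, and $\Phi(t) \ge w_0 > 0$ for $t \ge 0$, it is immediate that $\Psi$ is
well-defined, finite and continuous for $t \in [0, \rho)$. Furthermore, if $0 < t < \rho$,
then $t\Psi'(t)$ is by (4.10) the variance of a non-degenerate random variable,
and thus $t\Psi'(t) > 0$. Hence $\Psi(t)$ is increasing, completing the proof of (i).

(ii) If $\Phi(\rho) = \infty$, the claim is just the definition of $\Psi(\rho)$ in Section 2.
(Note that the existence of the limit follows from (i).) We may thus assume
$\Phi(\rho) < \infty$; then $t \nearrow \rho$ implies $\Phi(t) \to \Phi(\rho) < \infty$ and $\Phi'(t) \to \Phi'(\rho) \le \infty$
by monotone convergence, and thus

$$\Psi(t) = \frac{t\Phi'(t)}{\Phi(t)} \to \frac{\rho\Phi'(\rho)}{\Phi(\rho)} = \Psi(\rho).$$

(iii) The case $\rho = 0$ is trivial, and the case $\rho > 0$ follows from (i) and (ii).

(iv) For any $\ell \ge 0$,

$$\ell - \Psi(t) = \frac{\sum_{k=0}^\infty (\ell - k) w_k t^k}{\sum_{k=0}^\infty w_k t^k} \le \frac{\sum_{k=0}^{\ell} (\ell - k) w_k t^k}{\sum_{k=0}^\infty w_k t^k}. \tag{12.1}$$

If $\rho < \infty$ and $\Phi(\rho) = \infty$, we thus have

$$\ell - \Psi(t) \le \frac{\ell \sum_{k=0}^{\ell} w_k \rho^k}{\Phi(t)} \to 0 \qquad \text{as } t \nearrow \rho,$$

so $\Psi(\rho) - \ell \ge 0$. Since $\ell$ is arbitrary, this shows $\Psi(\rho) = \infty$, proving (iv).

(v) If $\rho = \infty$, choose $\ell$ with $w_\ell > 0$. Then (12.1) implies

$$\ell - \Psi(t) \le \frac{\sum_{k=0}^{\ell-1} (\ell - k) w_k t^k}{w_\ell t^\ell} \to 0 \qquad \text{as } t \to \infty,$$

so $\Psi(\infty) - \ell \ge 0$. Hence, $\Psi(\infty) \ge \sup\{\ell : w_\ell > 0\} = \omega$.
Conversely,

$$\Psi(t) = \frac{\sum_{k=0}^\infty k w_k t^k}{\sum_{k=0}^\infty w_k t^k} \le \omega \qquad \text{for all } t \in [0, \rho),$$

so $\Psi(\rho) \le \omega$, completing the proof of (v).

Finally, (3.9) follows from (i) and (ii). □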



Remark 12.1. Alternatively, the fact that $\Psi(t)$ is increasing can also be
seen as follows: Let $0 < a < b < \rho$ and let $Y$ be a random variable with
distribution $P(Y = k) = w_k a^k/\Phi(a)$ (cf. Lemma 4.2). Then $\Psi(a) = E\,Y$
and $\Psi(b) = E(Y(b/a)^Y) / E(b/a)^Y$, so $\Psi(a) \le \Psi(b)$ is equivalent to the
correlation inequality $E(Y(b/a)^Y) \ge E\,Y\, E(b/a)^Y$, which says that the two
random variables $f(Y) := Y$ and $g(Y) := (b/a)^Y$ are positively correlated;
it is well-known that this holds (as long as the expectations are finite) for
any two increasing functions $f$ and $g$ and any $Y$, see [50, Theorem 236]
where the result is attributed to Chebyshev, and it is easy to see that, in
fact, strict inequality holds in the present case. (The latter inequality is an
analogue of Harris' correlation inequality [51] for variables $Y$ with values in
a discrete cube $\{0, 1\}^N$; in fact, the inequalities have a common extension
to variables with values in $\mathbb R^N$. Cf. also the related FKG inequality, which
extends Harris' inequality; see for example [i^ where also its history is
described.)

For a third proof that $\Psi(t)$ is increasing, note that (3.7) shows that $\Psi$ is
(strictly) increasing if and only if $\log \Phi(e^x)$ is (strictly) convex, which is an
easy consequence of Hölder's inequality. (See e.g. [31, Lemma 2.2.5(a)] and
note that $\Phi(e^x) = \sum_{k=0}^\infty e^{kx} w_k$ is the moment generating function of $(w_k)$
in the case that $(w_k)$ is a probability weight sequence.)

Lemma 3.1 shows that $\Psi$ is a bijection $[0, \rho] \to [0, \Psi(\rho)] = [0, \nu]$, so it has
a well-defined inverse $\Psi^{-1} : [0, \nu] \to [0, \rho]$. We extend this inverse to $[0, \infty)$
as follows.

Lemma 12.2. For $x \ge 0$, define $\tau = \tau(x) \in [0, \infty]$ by

$$\tau(x) := \sup\{t \le \rho : \Psi(t) \le x\}. \tag{12.2}$$

Then $\tau(x)$ is the unique number in $[0, \rho]$ such that $\Psi(\tau(x)) = x$ when $x \le \nu$,
and $\tau(x) = \rho$ when $x \ge \nu$. Furthermore, the function $x \mapsto \tau(x)$ is
continuous, and, for any $x \ge 0$,

$$\Psi(\tau(x)) = \min(x, \nu). \tag{12.3}$$

If $x < \omega$, then $0 \le \tau(x) < \infty$ and $0 < \Phi(\tau(x)) < \infty$. On the other hand, if
$x \ge \omega$, then $\tau(x) = \Phi(\tau(x)) = \infty$.
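
For a weight sequence with finite support (so that $\Phi$ is a polynomial, $\rho = \infty$ and $\nu = \omega$), the function $\tau(x)$ of Lemma 12.2 can be computed by bisection, since $\Psi$ is increasing. The Python sketch below illustrates this special case only; it assumes $w_0 > 0$, the weights are passed as a list `w` with `w[k]` standing for $w_k$, and the names `psi` and `tau` are illustrative choices rather than notation fixed by the text.

```python
import math


def psi(t, w):
    """Psi(t) = t Phi'(t) / Phi(t) for the polynomial Phi(t) = sum_k w[k] t^k."""
    num = sum(k * wk * t ** k for k, wk in enumerate(w))
    den = sum(wk * t ** k for k, wk in enumerate(w))
    return num / den


def tau(x, w):
    """tau(x) = sup{t : Psi(t) <= x} for finitely supported weights with w[0] > 0."""
    omega = max(k for k, wk in enumerate(w) if wk > 0)
    if x >= omega:  # Psi(t) < omega for all finite t, so the supremum is infinite
        return math.inf
    lo, hi = 0.0, 1.0
    while psi(hi, w) <= x:  # bracket the solution of Psi(t) = x
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if psi(mid, w) <= x:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    w = [1, 2, 1]  # Phi(t) = (1 + t)^2, so Psi(t) = 2t / (1 + t)
    print(tau(0.5, w), tau(1.0, w), tau(2.0, w))
```

For $\Phi(t) = (1+t)^2$ one has $\Psi(t) = 2t/(1+t)$, so $\tau(x) = x/(2-x)$ for $x < 2$ and $\tau(x) = \infty$ for $x \ge \omega = 2$, in agreement with the last two sentences of the lemma.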

Proof. By Lemma 3.1 and the definition (3.10), $\Psi$ is an increasing continuous
bijection $[0, \rho] \to [0, \Psi(\rho)] = [0, \nu]$; thus if $0 \le x \le \nu$, there exists a unique
$\Psi^{-1}(x) \in [0, \rho]$ with $\Psi(\Psi^{-1}(x)) = x$, and (12.2) yields $\tau(x) = \Psi^{-1}(x)$. Since
$\Psi$ is a continuous bijection of one compact space onto another, its inverse
$\Psi^{-1} : [0, \nu] \to [0, \rho]$ is continuous too; thus $x \mapsto \tau(x) = \Psi^{-1}(x)$ is continuous
on $[0, \nu]$. Furthermore, (12.3) holds for $x \le \nu$.

If $x \ge \nu = \Psi(\rho)$, then (12.2) yields $\tau(x) = \rho$, and thus $\Psi(\tau(x)) = \Psi(\rho) = \nu$,
so (12.3) holds in this case too.

Combining the two cases we see that $x \mapsto \tau(x)$ is continuous on $[0, \infty)$,
and that (12.3) holds.

Now suppose that $x < \omega$ and $\tau(x) = \infty$. Since $\tau(x) \le \rho$ we then have
$\rho = \infty$, and Lemma 3.1(v) yields $\Psi(\tau(x)) = \Psi(\rho) = \omega > x$, contradicting
(12.3). Thus $\tau(x) < \infty$ when $x < \omega$. Furthermore, if $\Phi(\tau(x)) = \infty$, then
$\tau(x) = \rho$, since $\Phi(t) < \infty$ for $t < \rho$, and thus $\Phi(\rho) = \infty$. If further
$x < \omega$, and thus $\rho = \tau(x) < \infty$ as just shown, then Lemma 3.1(iv) would
give $\Psi(\tau(x)) = \Psi(\rho) = \infty$, again contradicting (12.3) since $x < \infty$. Thus
$\Phi(\tau(x)) < \infty$ when $x < \omega$.

Conversely, if $x \ge \omega$, then $\omega < \infty$, so $\Phi(t)$ is a polynomial and $\rho = \infty$.
Lemma 3.1(v) shows that $\Psi(\rho) = \omega \le x$, so (12.2) yields $\tau(x) = \rho = \infty$,
whence also $\Phi(\tau(x)) = \Phi(\infty) = \infty$. □

Next, we investigate when $Z(m,n) > 0$. We say that an allocation
$(y_1, \dots, y_n)$ of $m$ balls in $n$ boxes is good if it has positive weight, i.e., if
$y_i \in$ supp($\mathbf w$) for every $i$. Thus, $Z(m,n) > 0$ if and only if there is a good
allocation in $\mathcal B_{m,n}$; in this case, the random allocation $B_{m,n}$ is defined and
is always good.

Provided $m$ is not too small or too large, the $m$ and $n$ for which good
allocations exist are easily characterised; the following lemma shows that a
simple necessary condition also is sufficient. (The exact behaviour for very
small $m$ is complicated. The largest $m$ such that $Z(m,n) = 0$ for all $n$ is
called the Frobenius number of the set supp($\mathbf w$); it is a well-known, and in
general difficult, problem to compute this, see e.g. [10]. The case when $m$
is close to $\omega n$ (with finite $\omega$) is essentially the same by the symmetry in
Remark [Taini)

Lemma 12.3. Suppose that $w_0 > 0$.

(i) If $Z(m,n) > 0$, then span($\mathbf w$) $\mid m$ and $0 \le m \le \omega n$.

(ii) If $\omega < \infty$, then there exists a constant $C$ (depending on $\mathbf w$) such that
if span($\mathbf w$) $\mid m$ and $C \le m \le \omega n - C$, then $Z(m,n) > 0$.

(iii) If $\omega = \infty$, then for each $C' < \infty$, there exists a constant $C$ (depending
on $\mathbf w$ and $C'$) such that if span($\mathbf w$) $\mid m$ and $C \le m \le C'n$, then
$Z(m,n) > 0$.

Proof. (i): $Z(m,n) > 0$ if and only if $m = \sum_{i=1}^n y_i$ for some $y_i$ with $w_{y_i} > 0$,
i.e., $y_i \in$ supp($\mathbf w$). This implies $0 \le y_i \le \omega$ and span($\mathbf w$) $\mid y_i$ for each $i$, and
the necessary conditions in (i) follow immediately.

(ii): We may for convenience assume that span($\mathbf w$) $= 1$, see Remark 10.9;
then, by (3.3), supp($\mathbf w$) $\setminus \{0\}$ is a finite set of integers with greatest common
divisor 1. Thus, by a well-known theorem by Schur, see e.g. [109, 3.15.2] or
[40, Proposition IV.2], there is a constant $C_1$ such that every integer $m \ge C_1$
can be written as a finite sum $m = \sum_i y_i$ with $y_i \in$ supp($\mathbf w$) (repetitions
are allowed); i.e. we have a good allocation of $m$ balls in some number $\ell(m)$
boxes. Choose one such allocation for each $m \in [C_1, C_1 + \omega)$, and let $C_2$ be
the maximum number of boxes in any of them.

If $C_1 \le m \le \omega n - C_2\omega$, let $a := \lfloor (m - C_1)/\omega \rfloor$. Then $m - a\omega \in [C_1, C_1 + \omega)$,
and has thus a good allocation in at most $C_2$ boxes. We add $a$ boxes with
$\omega$ balls each, and have obtained a good allocation of $m$ balls using at most

$$C_2 + a = C_2 + \lfloor (m - C_1)/\omega \rfloor \le C_2 + \lfloor (\omega n - C_2\omega - C_1)/\omega \rfloor \le n$$

boxes. Hence we may add empty boxes and obtain a good allocation in $\mathcal B_{m,n}$.
(Recall that $0 \in$ supp($\mathbf w$).) Thus $Z(m,n) > 0$ when $C_1 \le m \le \omega n - C_2\omega$.

(iii): We may again assume span($\mathbf w$) $= 1$. Let $K$ be a large integer and
consider the truncated weight sequence $\mathbf w^{(K)} = (w_k^{(K)})$ defined by

$$w_k^{(K)} := \begin{cases} w_k, & k \le K, \\ 0, & k > K; \end{cases} \tag{12.4}$$

we assume that $K \in$ supp($\mathbf w$) and that $K$ is so large that $K \ge C' + 1$ and
span($\mathbf w^{(K)}$) $=$ span($\mathbf w$) $= 1$. Then $\omega(\mathbf w^{(K)}) = K$, and (ii) shows that for
some $C_3$, if $C_3 \le m \le Kn - C_3$, then $Z(m,n;\mathbf w) \ge Z(m,n;\mathbf w^{(K)}) > 0$.
Hence, if $m \ge C_3$ and $Z(m,n) = 0$, then $Kn - C_3 < m \le C'n$, and thus
$n < C_3$, whence $m < C'C_3$. Consequently, if $C'C_3 \le m \le C'n$, then
$Z(m,n) > 0$. □

Remark 12.4. In the case $\omega = \infty$, it is not always true that there is a
constant $C$ such that $Z(m,n) > 0$ whenever $m \ge C$. For example, suppose
that $w_k = 1$ when $k = 0$ or $k = j!$ for some $j \ge 0$, and $w_k = 0$ otherwise.
Then $Z(m,n) = 0$ when $m = (n+1)! - 1$ and $n \ge 2$.
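
The counterexample in Remark 12.4 can be confirmed by a small search. The Python sketch below decides whether a good allocation of $m$ balls in $n$ boxes exists by building the set of reachable totals box by box; the function name `has_good_allocation` and the truncation of the support at $7!$ are illustrative assumptions (the truncation is harmless here because larger factorials exceed the values of $m$ tested).

```python
from math import factorial


def has_good_allocation(m, n, support):
    """True iff m = y_1 + ... + y_n for some y_i in `support`, i.e. Z(m, n) > 0."""
    reachable = {0}  # totals attainable with the boxes filled so far
    for _ in range(n):
        reachable = {s + y for s in reachable for y in support if s + y <= m}
    return m in reachable


# Support of the weight sequence in Remark 12.4, truncated at 7! = 5040.
FACTORIAL_SUPPORT = {0} | {factorial(j) for j in range(8)}

if __name__ == "__main__":
    for n in range(2, 6):
        m = factorial(n + 1) - 1
        print(n, m, has_good_allocation(m, n, FACTORIAL_SUPPORT))
```

The reason is visible in the arithmetic: every box holds at most $n!$ balls unless it holds at least $(n+1)!$, and $n \cdot n! < (n+1)! - 1$ for $n \ge 2$.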

Remark 12.5. Lemma 12.3 is easily modified for the case $w_0 = 0$; if
$\alpha := \min\{k : w_k > 0\}$ as in Remark 10.8, then the necessary condition (i) is
$\alpha n \le m \le \omega n$ and span($\mathbf w$) $\mid (m - \alpha n)$, and again this is sufficient if $m$ stays
away from the boundaries.

13. Proofs of Theorems 10.4-10.7

We now prove the theorems in Section 10; we begin with some lemmas.

First we state and prove a version of the local central limit theorem (for
integer-valued variables) that is convenient for our application below. We
will need it for a triangular array, where the variables we sum depend on $n$.

We define the span of an integer-valued random variable to be the span
of its distribution, defined as in (3.2).



Lemma 13.1. Let $\xi$ and $\xi^{(1)}, \xi^{(2)}, \dots$ be integer-valued random variables
with $\xi^{(n)} \xrightarrow{d} \xi$ as $n \to \infty$, and let $S_n^{(n)} := \sum_{i=1}^n \xi_i^{(n)}$, where $\xi_i^{(n)}$ are
independent copies of $\xi^{(n)}$. Suppose further that $\xi$ is non-degenerate, with span
$d$ and finite variance $\sigma^2 > 0$, and that $\sup_n E|\xi^{(n)}|^3 < \infty$. If $d > 1$, we
assume for simplicity that $d \mid \xi$ and $d \mid \xi^{(n)}$ for each $n$.

Let $m = m(n)$ be a sequence of integers that are multiples of $d$, and
assume that $E\,\xi^{(n)} = m(n)/n$. Then, as $n \to \infty$,

$$P\bigl(S_n^{(n)} = m\bigr) = \frac{d + o(1)}{\sqrt{2\pi\sigma^2 n}}. \tag{13.1}$$


Proof. The proof uses standard arguments, see e.g. Kolchin [76, Theorem
1.4.2]; we only have to check uniformity in $\xi^{(n)}$ of our estimates.

If the span $d > 1$, we may divide $\xi$, $\xi^{(n)}$ and $m$ by $d$, and reduce to the
case $d = 1$. Hence we assume in the proof that span($\xi$) $= 1$.

Let $\varphi(t) := E\,e^{it\xi}$ and $\varphi_n(t) := E\,e^{it\xi^{(n)}}$ be the characteristic functions of
$\xi$ and $\xi^{(n)}$. Further, let $\tilde\varphi_n(t) := e^{-itm/n}\varphi_n(t)$ be the characteristic function
of the centred random variable $\xi^{(n)} - E\,\xi^{(n)} = \xi^{(n)} - m/n$.

Then $S_n^{(n)}$ has characteristic function $\varphi_n(t)^n$, and thus, by the inversion
formula and a change of variables,

$$P\bigl(S_n^{(n)} = m\bigr) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-itm} \varphi_n(t)^n \, dt = \frac{1}{2\pi\sqrt n} \int_{-\infty}^{\infty} \tilde\varphi_n(x/\sqrt n)^n\, \mathbf 1\{|x| < \pi\sqrt n\} \, dx. \tag{13.2}$$

Let $\sigma_n^2$ be the variance of $\xi^{(n)}$. Since $E|\xi^{(n)}|^3$ are uniformly bounded,
$\sigma_n^2 < \infty$; moreover, the random variables $\xi^{(n)}$ are uniformly square integrable
and it follows from $\xi^{(n)} \xrightarrow{d} \xi$ that $\sigma_n^2 \to \sigma^2$. (See e.g. Gut [49, Theorems
5.4.2 and 5.4.9] for this standard argument.) In particular, $\sigma_n^2 \ge \sigma^2/2$ for
all sufficiently large $n$; we consider in the remainder of the proof only such
$n$.

Since $\tilde\varphi_n(t)$ is the characteristic function of $\xi^{(n)} - E\,\xi^{(n)}$, which has mean 0
and, by assumption, an absolute third moment that is uniformly bounded,
we have by a standard expansion (see e.g. [49, Theorem 4.4.1])

$$\tilde\varphi_n(t) = 1 - \tfrac12 \sigma_n^2 t^2 + O\bigl(E|\xi^{(n)} - E\,\xi^{(n)}|^3\, |t|^3\bigr) = 1 - \tfrac12 \sigma_n^2 t^2 + O(|t|^3), \tag{13.3}$$

uniformly in all $n$ and $t$. In particular, for any fixed real $x$,

$$\tilde\varphi_n(x/\sqrt n) = 1 - \frac{\sigma_n^2 x^2}{2n} + O(n^{-3/2}) = 1 - \frac{\sigma^2 x^2 + o(1)}{2n} \tag{13.4}$$

and thus

$$\tilde\varphi_n(x/\sqrt n)^n \to e^{-\sigma^2 x^2/2}. \tag{13.5}$$

We are aiming at estimating the integral in (13.2) by dominated convergence,
so we also need a suitable bound that is uniform in $n$.

We write (13.3) as $|\tilde\varphi_n(t) - (1 - \frac12 \sigma_n^2 t^2)| \le C_1 |t|^3$. Let $\delta := \sigma^2/(8C_1) > 0$.
Then, if $|t| \le \delta$, recalling our assumption $\sigma_n^2 \ge \frac12 \sigma^2$,

$$|\tilde\varphi_n(t)| \le 1 - \tfrac12 \sigma_n^2 t^2 + C_1 |t|^3 \le 1 - \tfrac14 \sigma^2 t^2 + \tfrac18 \sigma^2 t^2 = 1 - \tfrac18 \sigma^2 t^2. \tag{13.6}$$

For $\delta \le |t| \le \pi$ we claim that there exist $n_0$ and $\eta > 0$ such that if
$n \ge n_0$ and $\delta \le |t| \le \pi$, then

$$|\tilde\varphi_n(t)| \le 1 - \eta. \tag{13.7}$$

In fact, if this were not true, then there would exist sequences $n_k \ge k$ and
$t_k \in [\delta, \pi]$ (by symmetry, it suffices to consider $t > 0$) such that
$|\tilde\varphi_{n_k}(t_k)| = |\varphi_{n_k}(t_k)| > 1 - 1/k$. By considering a subsequence, we may assume that
$t_k \to t_\infty$ as $k \to \infty$ for some $t_\infty \in [\delta, \pi]$. Since $\xi^{(n)} \xrightarrow{d} \xi$, $\varphi_n(t) \to \varphi(t)$
uniformly for $|t| \le \pi$, and thus $\varphi_{n_k}(t_k) \to \varphi(t_\infty)$. It follows that $|\varphi(t_\infty)| = 1$
for some $t_\infty \in [\delta, \pi]$, but this is impossible when span($\xi$) $= 1$, as is
well-known (and easily seen from $E\,e^{it_\infty(\xi - \xi')} = |\varphi(t_\infty)|^2 = 1$, where $\xi'$ is an
independent copy of $\xi$). This contradiction shows that (13.7) holds.

We can combine (13.6) and (13.7): we let $c_1 := \min\{\sigma^2/8, \eta/\pi^2\}$ and
obtain, for $n \ge n_0$,

$$|\tilde\varphi_n(t)| \le 1 - c_1 t^2 \le \exp(-c_1 t^2), \qquad |t| \le \pi,$$

and thus

$$|\tilde\varphi_n(x/\sqrt n)|^n \le \exp(-c_1 x^2), \qquad |x| \le \pi\sqrt n.$$

This justifies the use of dominated convergence in (13.2), and we obtain by
(13.5)

$$\int_{-\infty}^{\infty} \tilde\varphi_n(x/\sqrt n)^n\, \mathbf 1\{|x| < \pi\sqrt n\} \, dx \to \int_{-\infty}^{\infty} e^{-\sigma^2 x^2/2} \, dx = \frac{\sqrt{2\pi}}{\sigma},$$

which yields (13.1). (Recall that we have assumed $d = 1$.) □

Remark 13.2. A simple modification of the proof shows that the result still
holds if the condition $E\,\xi^{(n)} = m(n)/n$ is relaxed to $m(n) = n\,E\,\xi^{(n)} + o(\sqrt n)$.

Furthermore, for any $m = m(n)$, $P(S_n^{(n)} = m) \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |\tilde\varphi_n(t)|^n \, dt$, and it
follows by the proof above that

$$P\bigl(S_n^{(n)} = m\bigr) \le \frac{d + o(1)}{\sqrt{2\pi\sigma^2 n}}, \tag{13.8}$$

uniformly in all $m \in \mathbb Z$.

Moreover, both Lemma 13.1 and the remarks above hold, with only minor
modifications in the proof, also if the condition $\sup_n E|\xi^{(n)}|^3 < \infty$ is
relaxed to uniform square integrability of $\xi^{(n)}$. In particular, if $\xi^{(n)} = \xi$, this
assumption is not needed at all; then the assumption $\sigma^2 < \infty$ is the only
moment condition that we need. (This is the classical local central limit
theorem for discrete distributions, see e.g. Gnedenko and Kolmogorov [46,
§ 49] or Kolchin [76, Theorem 1.4.2].)



We use Lemma 13.1 to obtain lower bounds of the (rather weak) type
$\exp(o(n))$ for $P(S_n = m)$ in the case of a probability weight sequence, for
suitable $m$. We treat the cases $\rho > 1$ and $\rho = 1$ separately.

Lemma 13.3. Let $\mathbf w$ be a probability weight sequence with $0 < w_0 < 1$ and
$\rho > 1$. Let $\xi_1, \xi_2, \dots$ be i.i.d. random variables with distribution $\mathbf w$ and let
$S_n := \sum_{i=1}^n \xi_i$.

Assume that $m = m(n)$ are integers that are multiples of $d :=$ span($\mathbf w$),
and that $m(n)/n \to E\,\xi_1$. Then

$$P(S_n = m) = Z(m,n) = e^{o(n)}.$$

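Proof. Let $\xi := \xi_1$ and $\lambda := E\,\xi = \Phi'(1) = \Psi(1)$. Since $\rho > 1$, we have
$\nu > \Psi(1) = \lambda$. Thus, by assumption, $m/n \to \lambda < \nu$, so $m/n < \nu$ for all
large $n$; we consider in the sequel only such $n$. By Lemma 3.1 we may then
define $\tau_n \in [0, \rho)$ by $\Psi(\tau_n) = m/n$. Since $\Psi^{-1}$ is continuous on $[0, \nu)$, and
$\Psi(1) = E\,\xi = \lambda$, we have

$$\tau_n = \Psi^{-1}(m/n) \to \Psi^{-1}(\lambda) = 1 \qquad \text{as } n \to \infty. \tag{13.9}$$

Let $\xi^{(n)}$ have the conjugate distribution

$$P(\xi^{(n)} = k) = \frac{\tau_n^k}{\Phi(\tau_n)}\, w_k, \qquad k \ge 0; \tag{13.10}$$

by Lemma 4.2 this is a probability distribution with expectation

$$E\,\xi^{(n)} = \Psi(\tau_n) = m/n. \tag{13.11}$$

The conditions of Lemma 13.1 are easily verified: Since $\tau_n \to 1$ by (13.9),
we have $P(\xi^{(n)} = k) \to w_k = P(\xi = k)$ and thus $\xi^{(n)} \xrightarrow{d} \xi$. Furthermore,
taking any $\tau_* \in (1, \rho)$ and considering only $n$ that are so large that $\tau_n < \tau_*$,

$$E|\xi^{(n)}|^3 = \sum_{k=0}^\infty k^3 \frac{\tau_n^k}{\Phi(\tau_n)}\, w_k \le \frac{1}{\Phi(0)} \sum_{k=0}^\infty k^3 \tau_*^k w_k < \infty.$$

Furthermore, if $d =$ span($\xi$), then $w_k > 0 \implies d \mid k$ by (3.3); thus $d \mid \xi$ and
$d \mid \xi^{(n)}$ (a.s.). Lemma 13.1 thus applies, and if $\mathbf w^{(n)}$ denotes the distribution
of $\xi^{(n)}$ in (13.10), then by (10.8) and (13.1),

$$Z(m,n;\mathbf w^{(n)}) = P\bigl(S_n^{(n)} = m\bigr) = \frac{d + o(1)}{\sqrt{2\pi\sigma^2 n}}, \tag{13.12}$$

where $\sigma^2 := \operatorname{Var} \xi$. By (10.9), we have $Z(m,n;\mathbf w^{(n)}) = \Phi(\tau_n)^{-n} \tau_n^m Z(m,n)$,
and thus, recalling that $\tau_n \to 1 < \rho$ and hence $\Phi(\tau_n) \to \Phi(1) = 1$,

$$P(S_n = m) = Z(m,n) = \tau_n^{-m} \Phi(\tau_n)^n Z(m,n;\mathbf w^{(n)})$$
$$= \exp\bigl(-m \log \tau_n + n \log \Phi(\tau_n) + \log Z(m,n;\mathbf w^{(n)})\bigr)$$
$$= \exp(o(n)). \qquad □$$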

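Lemma 13.4. Let $\mathbf w$ be a probability weight sequence with $0 < w_0 < 1$ and
$\rho = 1$. Let $\xi_1, \xi_2, \dots$ be i.i.d. random variables with distribution $\mathbf w$ and let
$S_n := \sum_{i=1}^n \xi_i$.

Assume that $m = m(n)$ are integers that are multiples of $d :=$ span($\mathbf w$),
and that $m(n)/n \to \lambda < \infty$ with $\lambda \ge E\,\xi_1$. Then

$$P(S_n = m) = Z(m,n) = e^{o(n)}.$$
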
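Proof. Let $K$ be a large integer and consider the truncated weight sequence
$\mathbf{w}^{(K)} = (w_k^{(K)})$ defined by, as in (12.4),

$$w_k^{(K)} := \begin{cases} w_k, & k \le K, \\ 0, & k > K, \end{cases} \tag{13.13}$$

having generating function $\Phi_K(t) = \sum_{k=0}^{K} w_k t^k$, and the corresponding
$\Psi_K(t) := t\Phi_K'(t)/\Phi_K(t)$. We assume that $K$ is so large that
$\operatorname{span}(\mathbf{w}^{(K)}) = \operatorname{span}(\mathbf{w})$, and that $K > k$ for some $k > \lambda$ with $w_k > 0$.
(Such $k > \lambda$ exists since $\rho < \infty$.) Thus the weight sequence $\mathbf{w}^{(K)}$ has, by
Lemma 3.1(v), $\nu(\mathbf{w}^{(K)}) = \Psi_K(\infty) = \omega(\mathbf{w}^{(K)}) > \lambda$. Hence, by Lemma 3.1 again,
there exists $\tau_K \in [0,\infty)$ such that $\Psi_K(\tau_K) = \lambda$. Thus the probability
distribution $\boldsymbol{\pi}^{(K)} = (\pi_k^{(K)})$ defined by

$$\pi_k^{(K)} := \frac{\tau_K^k\, w_k^{(K)}}{\Phi_K(\tau_K)} \tag{13.14}$$

has expectation $\lambda$. Since this distribution has finite support it has radius
of convergence $\rho_\pi = \infty$; furthermore, $m/n \to \lambda$ by assumption. Hence
Lemma 13.3 applies to $\boldsymbol{\pi}^{(K)}$ and yields

$$Z(m,n;\boldsymbol{\pi}^{(K)}) = e^{o(n)}. \tag{13.15}$$

By (10.9) and (13.14),

$$Z(m,n;\boldsymbol{\pi}^{(K)}) = \Phi_K(\tau_K)^{-n}\,\tau_K^{m}\, Z(m,n;\mathbf{w}^{(K)}). \tag{13.16}$$

Moreover, $Z(m,n;\mathbf{w}) \ge Z(m,n;\mathbf{w}^{(K)})$ since $w_k \ge w_k^{(K)}$ for each $k$. Hence,
by (13.16) and (13.15),

$$Z(m,n;\mathbf{w}) \ge Z(m,n;\mathbf{w}^{(K)}) = \tau_K^{-m}\,\Phi_K(\tau_K)^{n}\, Z(m,n;\boldsymbol{\pi}^{(K)}) = \tau_K^{-m}\,\Phi_K(\tau_K)^{n}\, e^{o(n)}. \tag{13.17}$$

This holds for every large fixed $K$. Now let $K \to \infty$.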

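If $0 < t < \rho = 1$, then $\Phi_K(t) \to \Phi(t)$ and $\Phi_K'(t) \to \Phi'(t)$ as $K \to \infty$,
so $\Psi_K(t) \to \Psi(t) < \Psi(1) = \nu \le \lambda$. Hence, for large $K$, $\Psi_K(t) < \lambda = \Psi_K(\tau_K)$,
so $\tau_K > t$. Consequently, $\liminf_{K\to\infty} \tau_K \ge 1$.

On the other hand, if $t > \rho = 1$, let $\ell := \lfloor\lambda\rfloor + 1 > \lambda$, and assume $K \ge \ell$.
Then

$$\Psi_K(t) = \frac{\sum_{k=0}^{K} k\, w_k t^k}{\Phi_K(t)} \ge \ell\, \frac{\sum_{k=\ell}^{K} w_k t^k}{\Phi_K(t)} = \ell \Bigl(1 - \frac{\sum_{k<\ell} w_k t^k}{\Phi_K(t)}\Bigr) \to \ell > \lambda, \tag{13.18}$$

as $K \to \infty$, since $\Phi_K(t) \to \Phi(t) = \infty$. Hence, for large $K$, $\Psi_K(t) > \lambda = \Psi_K(\tau_K)$,
and thus $\tau_K < t$. Consequently, $\limsup_{K\to\infty} \tau_K \le 1$.
Combining these upper and lower bounds, we have

$$\tau_K \to 1 \quad \text{as } K \to \infty.$$

If we take $t < 1$, we thus have for large $K$, $\tau_K > t$ and hence $\Phi_K(\tau_K) > \Phi_K(t)$.
Thus, $\liminf_{K\to\infty} \Phi_K(\tau_K) \ge \lim_{K\to\infty} \Phi_K(t) = \Phi(t)$ for every $t < 1$,
so

$$\liminf_{K\to\infty} \Phi_K(\tau_K) \ge \Phi(1) = 1.$$

Given any $\varepsilon > 0$, we may thus take $K$ so large that $\tau_K < e^{\varepsilon}$ and
$\Phi_K(\tau_K) > e^{-\varepsilon}$. Then (13.17) yields

$$Z(m,n;\mathbf{w}) \ge e^{-\varepsilon m - \varepsilon n + o(n)} \ge e^{-\varepsilon m - 2\varepsilon n}$$

for large $n$. Since $\varepsilon$ is arbitrary and $m = O(n)$, this shows $Z(m,n;\mathbf{w}) \ge e^{-o(n)}$,
and the result follows since $Z(m,n) \le 1$ for any probability weight
sequence by (10.8). □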



Proof of Theorem 10.4. First, Lemma 12.2 shows that $\tau$ defined by (i) and
(ii) is well-defined and equals $\tau(\lambda)$ defined in Lemma 12.2; since $\lambda < \omega$ we
have $\tau < \infty$ and $\Phi(\tau) < \infty$. Further, (12.3) yields

$$\Psi(\tau) = \min(\lambda,\nu). \tag{13.19}$$

Since $\tau < \infty$ and $\Phi(\tau) < \infty$, $\pi_k$ is well-defined by (10.13); furthermore,
by Lemma 4.2 and (13.19), $(\pi_k)$ is a probability distribution with mean and
variance as asserted.

We now turn to proving (10.15), the main assertion. We study three cases
separately.

Case (a): $\tau > 0$. Then $\boldsymbol{\pi} = (\pi_k)$ is a probability weight sequence equivalent
to $\mathbf{w} = (w_k)$, so we may replace $(w_k)$ by $(\pi_k)$ without changing $B_{m,n}$. Note
that this changes $\rho$ and $\tau$ to $\rho(\boldsymbol{\pi}) = \rho(\mathbf{w})/\tau$ and $\tau(\boldsymbol{\pi}) = \tau(\mathbf{w})/\tau = 1$ by
(4.4) and (4.5). We may thus assume that $(w_k)$ equals the probability weight
sequence $(\pi_k)$, and that $\rho \ge \tau = 1$. By (13.19), then $\Psi(1) = \min(\lambda,\nu)$.
We employ the notation of Example 10.2. Note that by (10.14),

$$\mathbb{E}\,\xi_1 = \Phi'(1) = \min(\lambda,\nu) \le \lambda. \tag{13.20}$$

Moreover, if $\rho > 1$, then $\nu = \Psi(\rho) > \Psi(1)$ by Lemma 3.1, so (13.19) shows
that in this case,

$$\mathbb{E}\,\xi_1 = \Psi(1) = \lambda. \tag{13.21}$$

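The allocation $(\xi_1,\dots,\xi_n)$ (with a random sum $S_n$) consists of $n$ i.i.d.
components, so

$$N_k(\xi_1,\dots,\xi_n) = \sum_{i=1}^{n} \mathbf{1}\{\xi_i = k\} \sim \operatorname{Bi}(n,\pi_k) \tag{13.22}$$

has a binomial distribution. For every $k$ and $\varepsilon > 0$, we have by Chernoff's
inequality, see e.g. (gBI . Theorem 2.1 or Remark 2.5],

$$\mathbb{P}\bigl(|N_k(\xi_1,\dots,\xi_n) - n\pi_k| > \varepsilon n\bigr) \le \exp(-c_\varepsilon n), \tag{13.23}$$

for some constant $c_\varepsilon > 0$ depending on $\varepsilon$.
We condition on $S_n = m$, recalling that

$$B_{m,n} \overset{d}{=} \bigl((\xi_1,\dots,\xi_n) \mid S_n = m\bigr). \tag{13.24}$$

When $\rho > 1$ we apply Lemma 13.3 using $m/n \to \lambda$ and (13.21), and
when $\rho = 1$ we apply Lemma 13.4 using (13.20). In both cases we obtain
$\mathbb{P}(S_n = m) = \exp(o(n))$ and thus by (13.24),

$$\begin{aligned}
\mathbb{P}\bigl(|N_k(B_{m,n}) - n\pi_k| > \varepsilon n\bigr)
&= \mathbb{P}\bigl(|N_k(\xi_1,\dots,\xi_n) - n\pi_k| > \varepsilon n \mid S_n = m\bigr) \\
&\le \frac{\mathbb{P}\bigl(|N_k(\xi_1,\dots,\xi_n) - n\pi_k| > \varepsilon n\bigr)}{\mathbb{P}(S_n = m)}
\le \exp\bigl(-c_\varepsilon n + o(n)\bigr) \to 0.
\end{aligned}$$

Since $\varepsilon$ is arbitrary, this shows that

$$\frac{N_k(B_{m,n})}{n} \xrightarrow{p} \pi_k$$

as asserted, which completes the proof when $\tau > 0$.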

Case (b): $\tau = 0$ and $\rho > 0$. We write $N_k$ for $N_k(B_{m,n})$. By (10.13) we
have $\pi_0 = 1$ and $\pi_k = 0$ for $k > 0$; hence, (10.15) says that $N_0/n \xrightarrow{p} 1$ and
$N_k/n \xrightarrow{p} 0$ for $k > 0$.

Since $\tau < \rho$, we are in case (i) so $\lambda = \Psi(\tau) = \Psi(0) = 0$. In other words,
$m/n \to 0$. The result is trivial (and deterministic) in this case. We have

$$\frac{1}{n}\sum_{k=1}^{\infty} N_k \le \frac{1}{n}\sum_{k=1}^{\infty} k N_k = \frac{m}{n} \to \lambda = 0. \tag{13.25}$$

Hence $N_k/n \to 0 = \pi_k$ for every $k \ge 1$. Moreover, (13.25) also implies

$$\frac{N_0}{n} = \frac{n - \sum_{k=1}^{\infty} N_k}{n} \to 1 = \pi_0, \tag{13.26}$$

which completes the proof when $\tau = 0 < \rho$.



Before we treat the remaining case in Theorem 10.4, we show that
Theorem 10.6 too holds in the cases treated so far.

Proof of Theorem 10.6 from Theorem 10.4. We prove that Theorem 10.4 for
some weight sequence $(w_k)$ implies Theorem 10.6 for the same weights. The
assertions about $\tau$ follow from Lemma 12.2, so we turn to (10.17).

Consider a subsequence of $(m(n),n)$. It suffices to show that every such
subsequence has a subsubsequence such that (10.17) holds. (See e.g. [H,
Section 5.7], [H, p. 12] or [?, Theorem 2.3] for this standard argument.)

Since $m/n \le C$ by assumption, we can select a subsubsequence such that
$m/n \to \lambda$ for some $\lambda \le C < \omega$. Then Theorem 10.4 applies and thus (along
the subsubsequence),

$$\frac{N_k(B_{m,n})}{n} \xrightarrow{p} \frac{w_k\,(\tau(\lambda))^k}{\Phi(\tau(\lambda))}. \tag{13.27}$$

Furthermore, since $m/n \to \lambda$ and $x \mapsto \tau(x)$ is continuous, $\tau(m/n) \to \tau(\lambda)$
(along the subsubsequence); hence

$$\frac{w_k\,(\tau(m/n))^k}{\Phi(\tau(m/n))} \to \frac{w_k\,(\tau(\lambda))^k}{\Phi(\tau(\lambda))}. \tag{13.28}$$

Combining (13.27) and (13.28), we see that (10.17) holds along the
subsubsequence, which as said above completes the proof of (10.17).

That (10.17) holds uniformly is, in fact, automatic since we have shown
it for an arbitrary $m(n)$ (although we stated it for emphasis): Let $X_{m,n}$
denote the left-hand side of (10.17), and let $\varepsilon > 0$. Choose $m(n)$ as the
integer $m \in [0,Cn]$ that maximises $\mathbb{P}(|X_{m,n}| > \varepsilon)$. Since (10.17) says that
$\mathbb{P}(|X_{m(n),n}| > \varepsilon) \to 0$, we have $\sup_{m \le Cn} \mathbb{P}(|X_{m,n}| > \varepsilon) \to 0$. □



Completion of the proof of Theorem 10.4. Case (c): $\rho = 0$. We write again
$N_k$ for $N_k(B_{m,n})$, recalling that this is a random variable. In this case $\nu = 0$
and $\tau = \rho = 0$ for every $\lambda \ge 0$. By (10.13) we thus have $\pi_0 = 1$ and $\pi_k = 0$
for $k > 0$; hence, as in case (b), we have to show that $N_0/n \xrightarrow{p} 1$ and
$N_k/n \xrightarrow{p} 0$ for $k > 0$. By assumption, $m/n$ converges, so the sequence $m/n$
is bounded; let $C$ be a large constant such that $m/n \le C$. Further, let $K$
be a large integer; we assume $K > 2C$ and (for simplicity) $w_K > 0$. (Note
that such $K$ exist since $\omega = \infty$ when $\rho = 0$.)

We say that a box is small if it contains at most $K$ balls, and large
otherwise. Let $N' := \sum_{k \le K} N_k$ be the number of small boxes and
$M' := \sum_{k \le K} k N_k$ the number of balls in them. Note first that by our
assumptions, $m/n \le C < K/2$. Hence,

$$m \ge m - M' = \sum_{k=K+1}^{\infty} k N_k \ge K \sum_{k=K+1}^{\infty} N_k = K(n - N') \ge \frac{2m}{n}\,(n - N'). \tag{13.29}$$

Thus, $n - N' \le n/2$ and $N' \ge n/2$; in particular $N' \to \infty$. Moreover,

$$\frac{M'}{N'} \le \frac{m}{n/2} \le 2C. \tag{13.30}$$

The weight $w(\mathbf{y})$ in (10.2) factorizes as the product over the small boxes
times the product over the large boxes. Thus, if we condition on $M'$ and $N'$,
and moreover on the set of the $N'$ boxes that are small, then the allocations
of the small boxes and the large boxes are independent; moreover, the
allocations to the small boxes form a random allocation of the type $B_{M',N'}$ for
the truncated weight sequence $\mathbf{w}^{(K)}$ given by (12.4) above. By assumption,
$w_K > 0$, and thus the truncated sequence has $\omega^{(K)} := \omega(\mathbf{w}^{(K)}) = K$.

The truncated weight sequence $\mathbf{w}^{(K)}$ has a polynomial generating function
$\Phi^{(K)}(t) = \sum_{k=0}^{K} w_k t^k$ with an infinite radius of convergence $\rho^{(K)} = \infty$. We
have already proved Theorem 10.4 in this case, and thus Theorem 10.6
also holds in this case, by the proof above. Applying Theorem 10.6 to
the truncated weight sequence and the allocations of small boxes we see
that there exists a continuous function $\tau_K : [0,K) \to [0,\infty)$ such that,
conditioned on $(M',N')$,

$$\frac{N_k}{N'} - \frac{w_k\,(\tau_K(M'/N'))^k}{\Phi^{(K)}(\tau_K(M'/N'))} \xrightarrow{p} 0, \qquad k \le K. \tag{13.31}$$

Moreover, (13.31) holds uniformly in all $(M',N')$ by Theorem 10.6 and
(13.30). Hence, denoting the left-hand side of (13.31) by $X$, we have for
every $\varepsilon > 0$ that $\mathbb{P}(|X| > \varepsilon \mid M',N') \le \delta(n)$, for some function $\delta(n) \to 0$. Taking
the expectation, it follows that also $\mathbb{P}(|X| > \varepsilon) \le \delta(n) \to 0$, and thus (13.31)
holds also unconditionally. Thus,

$$\frac{N_k}{N'} = \frac{w_k\,(\tau_K(M'/N'))^k}{\Phi^{(K)}(\tau_K(M'/N'))} + o_p(1), \qquad k \le K. \tag{13.32}$$



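By (13.30), $M'/N' \le 2C$, and thus, using Lemma [H^] and $2C < K = \omega(\mathbf{w}^{(K)})$,
$\tau_K(M'/N') \le \tau_K(2C) < \infty$. Hence, with $C_1 := \tau_K(2C)$,

$$w_0 \le \Phi^{(K)}(\tau_K(M'/N')) \le \Phi^{(K)}(C_1) =: C_2,$$

say. Taking $k = 0$ in (13.32) we now find

$$\frac{N_0}{N'} = \frac{w_0}{\Phi^{(K)}(\tau_K(M'/N'))} + o_p(1) \ge \frac{w_0}{C_2} + o_p(1). \tag{13.33}$$

Since $N' \ge n/2$ this shows that there exists $c_2 > 0$ (for example $c_2 := w_0/(3C_2)$)
such that w.h.p.

$$\frac{N_0}{n} \ge c_2. \tag{13.34}$$

It follows further from (13.33) that we can invert (13.32) for $k = 0$ (since
$x \mapsto 1/x$ is continuous for $x > 0$); thus

$$\frac{N'}{N_0} = \frac{\Phi^{(K)}(\tau_K(M'/N'))}{w_0} + o_p(1). \tag{13.35}$$

Multiplying (13.32) and (13.35) we find the simpler relation

$$\frac{N_k}{N_0} = \frac{w_k}{w_0}\,(\tau_K(M'/N'))^k + o_p(1), \qquad k \le K. \tag{13.36}$$

Let $\ell := \min\{k > 0 : w_k > 0\}$ be the smallest non-zero index with positive
weight, and define a random variable $\tau_*$ by

$$\tau_* := \Bigl(\frac{w_0}{w_\ell}\,\frac{N_\ell}{N_0}\Bigr)^{1/\ell}. \tag{13.37}$$

It follows from (13.36), with $k = \ell$, that $\tau_* = \tau_K(M'/N') + o_p(1)$.
Consequently, (13.36) yields

$$\frac{N_k}{N_0} = \frac{w_k}{w_0}\,\tau_*^k + o_p(1), \qquad k \le K. \tag{13.38}$$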

We have so far worked with a fixed, large $K$. However, the definition
(13.37) does not depend on the choice of $K$, and since $K$ may be chosen
arbitrarily large, we see that, in fact, (13.38) holds for every $k \ge 0$, with the
same (random) $\tau_*$.

Fix again $K > 0$, and sum (13.38) for $k \le K$. This yields

$$\frac{N'}{N_0} = \frac{\Phi^{(K)}(\tau_*)}{w_0} + o_p(1). \tag{13.39}$$

Recall that $N_0/n \ge c_2$ w.h.p. by (13.34). We thus have from (13.39), since $N' \le n$,

$$\Phi^{(K)}(\tau_*) \le w_0\,\frac{n}{N_0} + o_p(1) \le w_0/c_2 + 1 \tag{13.40}$$

w.h.p. By assumption, $\rho = 0$, so $\Phi(t) = \infty$ for every $t > 0$. Hence, for every
$\varepsilon > 0$ we have $\Phi^{(K)}(\varepsilon) \to \Phi(\varepsilon) = \infty$ as $K \to \infty$, so we may choose $K$ with
$\Phi^{(K)}(\varepsilon) > w_0/c_2 + 1$. Then (13.40) shows that $\tau_* < \varepsilon$ w.h.p.; since $\varepsilon > 0$ is
arbitrary, this says that

$$\tau_* \xrightarrow{p} 0.$$

We substitute this in (13.38), and obtain $N_k/N_0 \xrightarrow{p} 0$ for every $k \ge 1$;
hence also

$$N_k/n \xrightarrow{p} 0, \qquad k \ge 1. \tag{13.41}$$
Finally, we return to (13.29), and see that

$$K(n - N') \le m \le Cn. \tag{13.42}$$

Let $\varepsilon > 0$ and choose $K > C/\varepsilon$; then (13.42) yields $n - N' < \varepsilon n$ and thus
$N' > (1-\varepsilon)n$. Further, by (13.41),

$$N_0 = N' - \sum_{k=1}^{K} N_k = N' - o_p(n) > (1-\varepsilon)n - o_p(n),$$

so w.h.p. $N_0 > (1-2\varepsilon)n$. This shows that $N_0/n \xrightarrow{p} 1$, which together with
(13.41) completes the proof in the case $\rho = 0$. □

This completes the proof of Theorem 10.4, and thus also of Theorem 10.6.

Proof of Theorem 10.7. Conditioned on the numbers $N_k = N_k(B_{m,n})$, $k =
0, 1, \dots$, the numbers $Y_1, \dots, Y_n$ are obtained by placing $N_0$ 0's, $N_1$ 1's, ...,
in (uniformly) random order; thus the conditional probability is

$$\mathbb{P}\bigl(Y_i = y_i,\ i \le \ell \mid N_0, N_1, \dots\bigr) = \prod_{i=1}^{\ell} \frac{N_{y_i} - c_i}{n - i + 1} = \prod_{i=1}^{\ell} \Bigl(\frac{N_{y_i}}{n} + O\bigl(\tfrac{1}{n}\bigr)\Bigr), \tag{13.43}$$

where $c_i := |\{j < i : y_j = y_i\}|$. By Theorem 10.4 this product converges
in probability to $\prod_{i=1}^{\ell} \pi_{y_i}$ as $n \to \infty$, and the result follows by taking the
expectation (using dominated convergence). □
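For a finite weight sequence the convergence $\mathbb{P}(Y_1 = k) \to \pi_k$ can be checked exactly on a small instance. The sketch below is only an illustration under stated assumptions: integer weights given as a Python list, the hypothetical helper names `z_row` and `first_box_law`, and the elementary identity $\mathbb{P}(Y_1 = k) = w_k Z(m-k,n-1)/Z(m,n)$ obtained by splitting off the first box.

```python
def z_row(w, m, n):
    """Return [Z(0,n), ..., Z(m,n)], where Z(j,n) is the sum of
    prod_i w[y_i] over all (y_1..y_n) with y_1 + ... + y_n = j."""
    row = [1] + [0] * m                 # n = 0: only the empty allocation
    for _ in range(n):                  # add one box at a time
        new = [0] * (m + 1)
        for j, zj in enumerate(row):
            if zj:
                for k, wk in enumerate(w):
                    if j + k > m:
                        break
                    new[j + k] += zj * wk
        row = new
    return row


def first_box_law(w, m, n):
    """P(Y_1 = k) = w[k] * Z(m-k, n-1) / Z(m, n) for k = 0..len(w)-1."""
    z_n = z_row(w, m, n)[m]
    z_prev = z_row(w, m, n - 1)
    return [w[k] * z_prev[m - k] / z_n if k <= m else 0.0
            for k in range(len(w))]
```

With `w = [1, 1, 1]` and `m = n = 200` we have $\lambda = 1$, $\tau = 1$ and $\pi_k = 1/3$; the values returned by `first_box_law` should then be close to 1/3, with a deviation of order $1/n$.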

14. Trees and balls-in-boxes 

The proofs of the results for random trees are based on a connection
with the balls-in-boxes model. This connection is well-known, see e.g. Otter
[93], Dwass [3a|, Kolchin [76], Pitman [99], but for completeness we give full
proofs.

We consider a fixed weight sequence $\mathbf{w} = (w_k)$ and the corresponding
random trees $\mathcal{T}_n$ and random allocations $B_{m,n}$; we write $B_{m,n} = (Y_1, \dots, Y_n)$.

We begin with some deterministic considerations. The idea is to regard
the outdegrees of the nodes of a tree $T$ as an allocation; we regard the nodes
as both balls and boxes, and if $v$ is a node, we put the children of $v$ as balls
in box $v$. There are two complications, which will be dealt with in detail
below: we have to specify an ordering of the nodes and we will not obtain
all allocations.

Let $T$ be a finite tree, with $|T| = n$. Take the nodes in some prescribed
order $v_1, \dots, v_n$; for definiteness we use the depth-first order (this is the
lexicographic order on $V_\infty$), and list the outdegrees as $d_1 = d^+(v_1), \dots, d_n =
d^+(v_n)$. We call this the degree sequence of $T$ and denote it by $\Lambda(T) :=
(d_1, \dots, d_n)$. Note that the tree $T$ can be reconstructed from $(d_1, \dots, d_n)$,
so $T$ is determined by $\Lambda(T) = (d_1, \dots, d_n)$.
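The passage from a tree to its degree sequence and back can be sketched in a few lines. This is an illustration only: the representation of an ordered tree as a nested list (a node is the list of its child subtrees) and the function names are assumptions made here, not notation from the paper.

```python
def degree_sequence(tree):
    """Outdegrees in depth-first order; a node is the list of its subtrees."""
    seq, stack = [], [tree]
    while stack:
        node = stack.pop()
        seq.append(len(node))
        stack.extend(reversed(node))    # so the leftmost child is visited first
    return seq


def tree_from_degrees(seq):
    """Rebuild the nested-list tree from a valid depth-first degree sequence."""
    it = iter(seq)

    def build():
        d = next(it)                    # outdegree of the current node
        return [build() for _ in range(d)]

    root = build()
    if next(it, None) is not None:      # entries left over: not a single tree
        raise ValueError("sequence continues after the tree is complete")
    return root
```

For the tree `[[[], []], [], [[]]]` with 7 nodes, `degree_sequence` gives `[3, 2, 0, 0, 0, 1, 0]`, whose sum is 6, and `tree_from_degrees` recovers the same nested list.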

By (2.2), $\sum_{i=1}^{n} d_i = n - 1$, so $(d_1, \dots, d_n)$ can be seen as an allocation
of $n - 1$ balls in $n$ boxes: $\Lambda(T) = (d_1, \dots, d_n) \in \mathcal{B}_{n-1,n}$. Consequently, $\Lambda$ is
an injective map $\mathfrak{T}_n \to \mathcal{B}_{n-1,n}$. Note also that $\Lambda$ preserves the weight:

$$w(T) = w(\Lambda(T)) \tag{14.1}$$

by the definitions (2.3) and (10.2). However, not every allocation
corresponds to a tree, so $\Lambda$ is not onto. We begin by characterizing the image
$\Lambda(\mathfrak{T}_n)$. We use a simple and well-known extension of (2.2).

Lemma 14.1. Let $T$ be a tree and $T'$ a subtree with the same root. Let
$\partial T' := \{v \in V(T) \setminus V(T') : v \sim w \text{ for some } w \in T'\}$ be the set of nodes
outside $T'$ with a parent inside it. Then,

$$\sum_{v \in T'} d_T^+(v) = |T'| + |\partial T'| - 1. \tag{14.2}$$

Proof. The set of children of the nodes in $T'$ consists of $(V(T') \setminus \{o\}) \cup
\partial T'$. □

Lemma 14.2. A sequence $(d_1, \dots, d_n) \in \mathbb{N}_0^n$ is the degree sequence of a tree
$T \in \mathfrak{T}_n$ if and only if

$$\sum_{i=1}^{k} d_i \ge k, \qquad 1 \le k < n, \tag{14.3}$$

$$\sum_{i=1}^{n} d_i = n - 1. \tag{14.4}$$

Of course, (14.4) is just the requirement that $(d_1, \dots, d_n) \in \mathcal{B}_{n-1,n}$.

Proof. For any $k \le n$, the nodes $v_1, \dots, v_k$ form a subtree $T_k$ of $T$, and
Lemma 14.1 yields

$$\sum_{i=1}^{k} d_T^+(v_i) = |\partial T_k| + k - 1, \tag{14.5}$$

which yields (14.3) since $|\partial T_k| \ge 1$ when $k < n$.

Conversely, if $(d_1, \dots, d_n)$ satisfies (14.3)-(14.4), a tree with degree
sequence $(d_1, \dots, d_n)$ is easily constructed. (The point is that (14.3) assures
that the construction will not stop before we have $n$ nodes.) □
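The criterion of Lemma 14.2 is easy to test mechanically. The sketch below, with a hypothetical function name, checks (14.3) and (14.4) for a given sequence of non-negative integers.

```python
def is_tree_degree_sequence(d):
    """Check conditions (14.3) and (14.4) for a sequence d_1..d_n."""
    n = len(d)
    if n == 0:
        return False
    partial = 0
    for k in range(1, n):               # (14.3): d_1 + ... + d_k >= k for k < n
        partial += d[k - 1]
        if partial < k:
            return False
    return partial + d[-1] == n - 1     # (14.4): the total equals n - 1
```

As a sanity check, enumerating all sequences of length 4 with entries in {0, 1, 2, 3} should leave exactly 5 valid ones, the number of ordered rooted trees with 4 nodes.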

The amazing fact is that for any allocation in $\mathcal{B}_{n-1,n}$, exactly one of
its cyclic shifts satisfies (14.3). (In particular, exactly $1/n$ of all allocations
satisfy (14.3).) To see this, it is simplest to consider the sequence $(d_i - 1)_{i=1}^{n}$;
we state a more general result that we will use later, see e.g. Takacs jlOSl ].
Wendel [loi]. Pitman

Lemma 14.3. Let $x_1, \dots, x_n \in \{-1, 0, 1, \dots\}$ with $x_1 + \dots + x_n = -r \le 0$.
For $j \in \mathbb{Z}$, let $x_1^{(j)}, \dots, x_n^{(j)}$ be the cyclic shift defined by $x_i^{(j)} := x_{i+j}$ with the
index taken modulo $n$, and consider the corresponding partial sums
$S_k^{(j)} := \sum_{i=1}^{k} x_i^{(j)}$, $k = 0, \dots, n$. Then there are exactly $r$ values of $j \in \{1, \dots, n\}$
such that

$$S_k^{(j)} > -r, \qquad 0 \le k < n. \tag{14.6}$$

Note that $S_0^{(j)} = 0$ and $S_n^{(j)} = -r$ for every $j$. The condition (14.6) thus
says that the walk $S_0^{(j)}, \dots, S_n^{(j)}$ first reaches $-r$ at time $n$. The case $r = 0$
is trivial: since $S_0^{(j)} = 0$, (14.6) then is never satisfied for $k = 0$.

Proof. We extend the definition of $x_j$ for all $j \in \mathbb{Z}$ by taking the index
modulo $n$; thus $x_{j+n} = x_j$. We further define $S_k$ for all $k \in \mathbb{Z}$ by $S_0 = 0$ and
$S_k - S_{k-1} = x_k$, $k \in \mathbb{Z}$; thus $S_k = \sum_{i=1}^{k} x_i$ when $k \ge 0$ and $S_k = -\sum_{i=k+1}^{0} x_i$
when $k < 0$. Then $S_{k+n} = S_k - r$ for all $k \in \mathbb{Z}$, and $S_k^{(j)} = S_{k+j} - S_j$.

Let further

$$M_k := \min_{-\infty < i \le k} S_i = \min_{k-n < i \le k} S_i;$$

note that $M_k$ is finite and $M_{k+n} = M_k - r$. Moreover, $M_{k+1} \le M_k$ and
$M_{k+1} - M_k$ is 0 or $-1$, since $S_{k+1} = S_k + x_{k+1} \ge S_k - 1$. We have

$$\begin{aligned}
S_k^{(j)} > -r \text{ for } 0 \le k < n
&\iff S_{k+j} - S_j > -r \text{ for } 0 \le k < n \\
&\iff S_{k+j} + r > S_j \text{ for } 0 \le k < n \\
&\iff S_{k+j-n} > S_j \text{ for } 0 \le k < n \\
&\iff S_i > S_j \text{ for } j - n \le i < j \\
&\iff M_{j-1} > S_j \\
&\iff M_{j-1} > M_j.
\end{aligned}$$

In each interval of $n$ integers, $M$ decreases by $r$ in steps of 1, so there are
exactly $r$ steps down, which completes the proof. □
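Lemma 14.3 lends itself to a brute-force check on short sequences. The following sketch (the function name is a hypothetical choice) lists the shifts $j$ satisfying (14.6) directly from the definition; the lemma asserts that the returned list has length $r$.

```python
def good_shifts(x):
    """Shifts j in {1..n} whose partial sums S_0..S_{n-1} all exceed -r,
    where r = -(x_1 + ... + x_n) and the shifted sequence is x_{1+j}, x_{2+j}, ..."""
    n = len(x)
    r = -sum(x)
    good = []
    for j in range(1, n + 1):
        s = 0
        ok = r > 0                      # S_0 = 0 > -r fails when r = 0
        for k in range(1, n):
            s += x[(j + k - 1) % n]     # 0-based position of x_{k+j}
            if s <= -r:
                ok = False
                break
        if ok:
            good.append(j)
    return good
```

For example, `good_shifts([1, -1, -1])` returns `[3]`: only the unshifted sequence stays above $-1$ before time 3.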

Corollary 14.4. If $(d_1, \dots, d_n) \in \mathcal{B}_{n-1,n}$, then exactly one of the $n$ cyclic
shifts of $(d_1, \dots, d_n)$ is the degree sequence $\Lambda(T)$ of a tree $T \in \mathfrak{T}_n$.

Proof. Let $x_i := d_i - 1$. Then $\sum_{i=1}^{k} x_i = \sum_{i=1}^{k} d_i - k$, so (14.3) is equivalent
to $\sum_{i=1}^{k} x_i \ge 0$ for $k < n$, which for the shifted sequence is (14.6) with $r = 1$;
further, $\sum_{i=1}^{n} x_i = n - 1 - n = -1$. Hence the result follows by Lemma 14.3
with $r = 1$. □

We now use our fixed weight sequence $(w_k)$. We begin with the partition
function for simply generated trees. This was proved (in the probability
weight sequence case, which is no real loss of generality) by Otter [93], see
also Dwass (sgI ]; an algebraic proof uses the Lagrange inversion formula
[79], see e.g. Boyd [19] and Drmota [33, Theorem 2.11]; Kolchin [76] gives a
different proof by induction. See also Pitman [99], where the relation between
different approaches is discussed.

Theorem 14.5.

$$Z_n = \frac{1}{n}\, Z(n-1,n).$$

Proof. By Corollary 14.4, the mapping $(T,j) \mapsto \Lambda(T)^{(j)}$, where ${}^{(j)}$ denotes a
cyclic shift as in Lemma 14.3, is a bijection of $\mathfrak{T}_n \times \{1, \dots, n\} \to \mathcal{B}_{n-1,n}$.
Consequently, by (10.4), (14.1) and (2.5), since the weight $w(\mathbf{y})$ is not changed
by cyclic shifts,

$$Z(n-1,n) = \sum_{T \in \mathfrak{T}_n} \sum_{j=1}^{n} w\bigl(\Lambda(T)^{(j)}\bigr) = \sum_{T \in \mathfrak{T}_n} n\, w(\Lambda(T)) = n \sum_{T \in \mathfrak{T}_n} w(T) = n Z_n.$$

□
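The identity of Theorem 14.5 can be verified by brute force for small $n$. The sketch below assumes integer weights in a Python list and uses hypothetical function names; `z_tree` sums $\prod_i w_{d_i}$ over all degree sequences satisfying (14.3)-(14.4), and `z_alloc` computes $Z(m,n)$ by adding one box at a time.

```python
from itertools import product
from math import prod


def z_alloc(w, m, n):
    """Z(m, n): total weight of all allocations of m balls in n boxes."""
    row = [1] + [0] * m
    for _ in range(n):
        new = [0] * (m + 1)
        for j, zj in enumerate(row):
            if zj:
                for k, wk in enumerate(w):
                    if j + k > m:
                        break
                    new[j + k] += zj * wk
        row = new
    return row[m]


def z_tree(w, n):
    """Z_n by enumeration of the degree sequences of trees with n nodes."""
    kmax = min(len(w) - 1, n - 1)
    total = 0
    for d in product(range(kmax + 1), repeat=n):
        if sum(d) == n - 1 and all(sum(d[:k]) >= k for k in range(1, n)):
            total += prod(w[di] for di in d)
    return total
```

With all weights equal to 1, `z_tree` counts ordered trees (Catalan numbers) and `z_alloc(w, n - 1, n)` counts weak compositions, so the check reduces to the classical relation between the two counts.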



Corollary 14.6. Suppose that $w_0 > 0$ and $\omega(\mathbf{w}) \ge 2$, with $d := \operatorname{span}(\mathbf{w})
\ge 1$. If $Z_n > 0$, then $n \equiv 1 \pmod d$. Conversely, for some $n_0$ (depending
on $\mathbf{w}$), if $n \equiv 1 \pmod d$ and $n \ge n_0$, then $Z_n > 0$.

Proof. By Theorem 14.5, $Z_n > 0 \iff Z(n-1,n) > 0$. The result follows
from Lemma 12.3. □



In the same way we can compute various probabilities for the random
tree $\mathcal{T}_n$. We begin with the root degree $d^+(o)$; note that for any tree $T$, $v_1$
is the root $o$, so $d^+(o) = d^+(v_1) = d_1$. (Lemma 14.7 is a special case of
Lemma 14.9 below, but we prefer to study this simpler case first because
it shows the main ideas in the proof without the complications (notational
and others) in the more general version.)

Lemma 14.7. For any $d \ge 0$ and $n \ge 2$,

$$\mathbb{P}\bigl(d^+_{\mathcal{T}_n}(o) = d\bigr) = \frac{n}{n-1}\, d\, \mathbb{P}(Y_1 = d). \tag{14.7}$$

Thus, the distribution of the root degree $d^+_{\mathcal{T}_n}(o)$ of $\mathcal{T}_n$ is the size-biased
distribution of $Y_1$.
Proof. Let $T \in \mathfrak{T}_n$ have degree sequence $(d_1, \dots, d_n)$. If $d_1 = d$, then
$(d_2, \dots, d_n)$ is an allocation in $\mathcal{B}_{n-1-d,n-1}$, and by Lemma 14.2 such an
allocation $(d_2, \dots, d_n)$ comes from a tree $T$ with $d_1 = d$ if and only if

$$d + \sum_{i=2}^{k} d_i \ge k, \qquad 1 \le k < n, \tag{14.8}$$

or, equivalently,

$$\sum_{i=1}^{k} d_{i+1} \ge k + 1 - d, \qquad 0 \le k < n - 1.$$

We use Lemma 14.3 again, now with $x_i = d_{i+1} - 1$ and $r = d$, and see that
exactly $r = d$ of the $n - 1$ cyclic shifts of $d_2, \dots, d_n$ satisfy (14.8). Thus, by
considering all trees $T$ with $d_1 = d$ and the $n - 1$ cyclic shifts of $d_2, \dots, d_n$,
we obtain each allocation $(d_1, \dots, d_n) \in \mathcal{B}_{n-1,n}$ with $d_1 = d$ exactly $r = d$
times. (It is possible that some shifts of $(d_2, \dots, d_n)$ coincide, but this does
not matter.) Consequently,

$$\begin{aligned}
(n-1)\, Z_n\, \mathbb{P}\bigl(d^+_{\mathcal{T}_n}(o) = d\bigr)
&= (n-1) \sum_{T \in \mathfrak{T}_n:\ d_T^+(o) = d} w(T) \\
&= d \sum_{(d_1,\dots,d_n) \in \mathcal{B}_{n-1,n}:\ d_1 = d} w\bigl((d_1, \dots, d_n)\bigr) \\
&= d\, Z(n-1,n)\, \mathbb{P}(Y_1 = d).
\end{aligned}$$

This yields the result by Theorem 14.5. □
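Remark 14.8. More explicitly we have

$$\begin{aligned}
Z(n-1,n)\, \mathbb{P}(Y_1 = d)
&= \sum_{(d_1,\dots,d_n) \in \mathcal{B}_{n-1,n}:\ d_1 = d} w\bigl((d_1, \dots, d_n)\bigr) \\
&= \sum_{(d_2,\dots,d_n) \in \mathcal{B}_{n-1-d,n-1}} w_d\, w\bigl((d_2, \dots, d_n)\bigr) \\
&= w_d\, Z(n-1-d, n-1),
\end{aligned}$$

and thus

$$\mathbb{P}\bigl(d^+_{\mathcal{T}_n}(o) = d\bigr) = \frac{n}{n-1}\, d\, w_d\, \frac{Z(n-1-d, n-1)}{Z(n-1,n)}. \tag{14.9}$$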
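The explicit formula for the root degree can be compared with a direct enumeration of trees. The sketch below is an illustration under assumptions: integer weights in a list, exact rational arithmetic, and hypothetical function names. It evaluates $\frac{n}{n-1}\, d\, w_d\, Z(n-1-d,n-1)/Z(n-1,n)$ and, separately, the root degree law obtained by summing tree weights.

```python
from fractions import Fraction
from itertools import product
from math import prod


def z_row(w, m, n):
    """[Z(0,n), ..., Z(m,n)] for the balls-in-boxes partition function."""
    row = [1] + [0] * m
    for _ in range(n):
        new = [0] * (m + 1)
        for j, zj in enumerate(row):
            if zj:
                for k, wk in enumerate(w):
                    if j + k > m:
                        break
                    new[j + k] += zj * wk
        row = new
    return row


def root_degree_law(w, n):
    """Root degree probabilities for d = 0..n-1 via the explicit formula."""
    z_n = z_row(w, n - 1, n)[n - 1]          # Z(n-1, n)
    z_prev = z_row(w, n - 1, n - 1)          # Z(., n-1)
    law = []
    for d in range(n):
        wd = w[d] if d < len(w) else 0
        law.append(Fraction(n * d * wd * z_prev[n - 1 - d], (n - 1) * z_n))
    return law


def root_degree_law_brute(w, n):
    """The same law by summing w(T) over all trees T with n nodes."""
    weights = [0] * n
    kmax = min(len(w) - 1, n - 1)
    for d in product(range(kmax + 1), repeat=n):
        if sum(d) == n - 1 and all(sum(d[:k]) >= k for k in range(1, n)):
            weights[d[0]] += prod(w[di] for di in d)
    total = sum(weights)
    return [Fraction(x, total) for x in weights]
```

For `w = [1, 1, 1]` and `n = 3` both functions give probabilities 0, 1/2, 1/2 for root degrees 0, 1, 2, matching the two ordered trees on three nodes.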



Proof of Theorem 7.10. By Theorem 10.7 (with $m = n - 1$ and $\lambda = 1$),
$\mathbb{P}(Y_1 = d) \to \pi_d$, and (7.9) follows from Lemma 14.7.

The space $\overline{\mathbb{N}}_0$ is compact, so every sequence of random variables in it is
tight, and therefore has a subsequence converging in distribution, see
[?, Section 6]. It follows from (7.9) that if $d^+_{\mathcal{T}_n}(o) \xrightarrow{d} X$ along a subsequence,
then $\mathbb{P}(X = k) = k\pi_k$ for every $k \in \mathbb{N}_0$, and thus $\mathbb{P}(X = \infty) = 1 -
\sum_{k=0}^{\infty} k\pi_k = 1 - \mu$. Consequently, $X \overset{d}{=} \hat\xi$, so $d^+_{\mathcal{T}_n}(o) \xrightarrow{d} \hat\xi$ for every convergent
subsequence, which means that the entire sequence converges to $\hat\xi$, see
[15, Theorem 2.3]. □

This proves the part of Theorem 7.1 that describes the root degree. It
remains to consider all other nodes. This will be done by similar arguments.
We begin with a generalization of Lemma 14.7.

Lemma 14.9. Let $T' \in \mathfrak{T}_f$ be a fixed finite subtree of the Ulam-Harris tree
$U_\infty$, let $\ell := |T'|$ be its size and let $v_1, \dots, v_\ell$ be its nodes in depth-first order,
and let $d_1', \dots, d_\ell'$ be its degree sequence. (I.e., $d_i' = d^+_{T'}(v_i)$.) Suppose that
$d_1, \dots, d_\ell \in \mathbb{N}_0$ and that $d_i \ge d_i'$ for every $i$. Then, for every $n > \ell$,

$$\mathbb{P}\bigl(d^+_{\mathcal{T}_n}(v_i) = d_i \text{ for } i = 1, \dots, \ell\bigr) = \frac{n}{n-\ell} \Bigl(\sum_{i=1}^{\ell} d_i - \ell + 1\Bigr)\, \mathbb{P}\bigl(Y_i = d_i \text{ for } i = 1, \dots, \ell\bigr). \tag{14.10}$$

Note that $d_T^+(v_i) \ge d_i'$ for $i = 1, \dots, \ell$ implies that $T \supseteq T'$.

Proof. We have earlier used the depth-first order of the nodes to define the
degree sequence, but many other orders could be used. In this proof, we
consider only trees $T$ that contain the given $T'$ as a subtree, and then we
choose the order which first takes the nodes of $T'$ in depth-first order (this
is $v_1, \dots, v_\ell$), and then the remaining nodes of $T$ in depth-first order; let
$\Lambda'(T)$ be the degree sequence in this order.

Let $A_n$ be the set of trees $T \in \mathfrak{T}_n$ with $d_T^+(v_i) = d_i$ for all $i \le \ell$ (which implies
$T \supseteq T'$). If $T \in A_n$, then the degree sequence $\Lambda'(T)$ thus begins with the
given $d_1, \dots, d_\ell$; furthermore, it satisfies (14.3). Conversely, every sequence
beginning with the given $d_1, \dots, d_\ell$ that satisfies (14.3) is the degree sequence
$\Lambda'(T)$ of a unique tree in $A_n$. Note also that (14.3) is automatically satisfied
for $k < \ell$, since then $d_i \ge d_i'$ for $i \le k$ and $\sum_{i=1}^{k} d_i' \ge k$ by Lemma 14.2
applied to $T'$.

Let $D := d_1 + \dots + d_\ell$. Consider a sequence $(d_1, \dots, d_n) \in \mathcal{B}_{n-1,n}$ beginning
with the given $d_1, \dots, d_\ell$, and let $x_i := d_{\ell+i} - 1$, for $i = 1, \dots, n - \ell$. Then
$(d_1, \dots, d_n)$ satisfies (14.3) if and only if

$$D + \sum_{i=1}^{k} (x_i + 1) \ge \ell + k$$

for $k = 0, \dots, n - \ell - 1$, which is equivalent to

$$\sum_{i=1}^{k} x_i \ge -(D - \ell), \qquad 0 \le k < n - \ell.$$

Furthermore,

$$\sum_{i=1}^{n-\ell} x_i = \sum_{i=\ell+1}^{n} d_i - (n - \ell) = (n - 1 - D) - (n - \ell) = -(D - \ell + 1).$$

Lemma 14.3 with $r = D - \ell + 1$ thus shows that of the $n - \ell$ cyclic
permutations of $d_{\ell+1}, \dots, d_n$, exactly $D - \ell + 1$ yield a degree sequence $\Lambda'(T)$ of a
tree $T \in A_n$. In other words, if we take the degree sequences $\Lambda'(T)$ for all
trees $T \in A_n$ and make these $n - \ell$ permutations of each of them, then we
obtain every allocation $\mathbf{y} = (y_1, \dots, y_n) \in \mathcal{B}_{n-1,n}$ with $y_i = d_i$, $i = 1, \dots, \ell$,
exactly $D - \ell + 1$ times each. Consequently,

$$\begin{aligned}
(n - \ell)\, Z_n\, \mathbb{P}(\mathcal{T}_n \in A_n)
&= (n - \ell) \sum_{T \in A_n} w(T) = (n - \ell) \sum_{T \in A_n} w(\Lambda'(T)) \\
&= \sum_{\mathbf{y} \in \mathcal{B}_{n-1,n}:\ y_i = d_i \text{ for } i \le \ell} (D - \ell + 1)\, w(\mathbf{y}) \\
&= (D - \ell + 1)\, Z(n-1,n)\, \mathbb{P}\bigl(Y_i = d_i \text{ for } i \le \ell\bigr).
\end{aligned}$$

The result follows by Theorem 14.5. □

Remark 14.10. Arguing as in Remark 14.8 we obtain from Lemma 14.9
the explicit formula, generalizing (14.9), with $D := \sum_{i=1}^{\ell} d_i$ and other
notations as above,

$$\mathbb{P}\bigl(d^+_{\mathcal{T}_n}(v_i) = d_i \text{ for } i = 1, \dots, \ell\bigr) = \frac{n}{n-\ell}\,(D - \ell + 1) \prod_{i=1}^{\ell} w_{d_i} \cdot \frac{Z(n-1-D, n-\ell)}{Z(n-1,n)}. \tag{14.11}$$

Remark 14.11. Note that Lemma 14.9 (or (14.11)) shows that the
probability remains exactly the same if we permute $d_1, \dots, d_\ell$, provided that the
permuted sequence $(d_{\sigma(i)})$ still is allowed, i.e., $d_{\sigma(i)} \ge d_i'$ for all $i \le \ell$.
However, if the latter condition fails for some $i$, then the probability typically
becomes 0. (This is an interesting case of a symmetry that is not complete.)
For example, considering only the root $o$ and its first child 1, we have

$$\mathbb{P}\bigl(d^+_{\mathcal{T}_n}(o) = d \text{ and } d^+_{\mathcal{T}_n}(1) = d'\bigr) = \mathbb{P}\bigl(d^+_{\mathcal{T}_n}(o) = d' \text{ and } d^+_{\mathcal{T}_n}(1) = d\bigr)$$

whenever $d, d' \ge 1$; however, if, say, $d \ge 1$ and $d' = 0$, then the right-hand
side is 0 while the left-hand side in general is not.

Remark 14.12. Lemma 14.9 extends with minor modifications (mainly
notational) to arbitrary finite rooted subtrees $T'$ of $U_\infty$ (not necessarily
satisfying (6.1)). We omit the details.



15. Proof of Theorem 7.1

First, as in the proof of Theorem 10.4, Lemma 12.2 shows that $\tau$ defined
by (i) and (ii) is well-defined and equals $\tau(1)$ defined in Lemma 12.2; since
$1 < 2 \le \omega$ we have $\tau < \infty$ and $\Phi(\tau) < \infty$. Further, (12.3) yields $\Psi(\tau) =
\min(1,\nu)$. Hence, by Lemma 4.2, $(\pi_k)$ is a probability distribution with
mean and variance as asserted. (This is a special case of the corresponding
claims in Theorem 10.4, with $\lambda = 1$. We have $\lambda = 1$ here since we relate the
random trees to allocations with $m = n - 1$, and thus $m/n \to 1$.)
The final claims follow by (7.2) and the construction in Section 5.

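We turn to the main assertion, $\mathcal{T}_n \xrightarrow{d} \hat{\mathcal{T}}$. Since $\mathfrak{T}$ is a compact metric
space, any sequence of random trees in $\mathfrak{T}$ is tight, and has thus a convergent
subsequence. (See e.g. [H, Section 6].) In particular, this holds for $\mathcal{T}_n$.

Consider a limiting random tree $\mathcal{T}$ in $\mathfrak{T}$ such that $\mathcal{T}_n \xrightarrow{d} \mathcal{T}$ along some
subsequence. We will show that then $\mathcal{T} \overset{d}{=} \hat{\mathcal{T}}$, regardless of the subsequence;
this implies $\mathcal{T}_n \xrightarrow{d} \hat{\mathcal{T}}$ for the full sequence, which then completes the proof.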

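We have defined $\mathfrak{T}$ in Section 6 such that $\mathfrak{T} \subset \overline{\mathbb{N}}_0^{V_\infty}$ using the embedding
$T \mapsto (d_T^+(v))_{v \in V_\infty}$. In order to show $\mathcal{T} \overset{d}{=} \hat{\mathcal{T}}$, it thus suffices to show that the
distributions agree on cylinder sets, i.e., that $(d^+(v_1), \dots, d^+(v_m)) \in \overline{\mathbb{N}}_0^m$ has
the same distribution for $\mathcal{T}$ and $\hat{\mathcal{T}}$, for any finite set $V = \{v_1, \dots, v_m\} \subset V_\infty$.
Since $\overline{\mathbb{N}}_0$ is a countable set, this is equivalent to

$$\mathbb{P}\bigl(d^+_{\mathcal{T}}(v_i) = d_i \text{ for } i = 1, \dots, m\bigr) = \mathbb{P}\bigl(d^+_{\hat{\mathcal{T}}}(v_i) = d_i \text{ for } i = 1, \dots, m\bigr) \tag{15.1}$$

for any finite set $V = \{v_1, \dots, v_m\} \subset V_\infty$ and any $d_1, \dots, d_m \in \overline{\mathbb{N}}_0$.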

It thus suffices to show (15.1). Furthermore, given any finite set $V \subset V_\infty$,
we may enlarge it to a finite set $V$ satisfying (6.2)-(6.4), i.e., a set that is
the node set of some finite tree in $\mathfrak{T}_f$. It thus suffices to show (15.1) for
$V = V(T')$ with $T' \in \mathfrak{T}_f$.

We make one more reduction. Suppose that $V = V(T')$ with $T' \in \mathfrak{T}_f$
and that (15.1) contains a condition $d^+(v_i) = d_i$ with $d_i < d^+_{T'}(v_i)$. Let
$v := v_i$ and let $u$ be the last child of $v$ in $T'$; thus (recalling the notation in
Section [H) $u = vj$ for some integer $j = d^+_{T'}(v) > d_i$. By (6.5), any tree $T \in \mathfrak{T}$
with $d_T^+(v) = d_i$ has $d_T^+(u) = 0$, and further (e.g. by (6.5) and induction)
$d_T^+(s) = 0$ for every descendant $s$ of $u$. Thus, letting $T'_u$ denote the subtree
of $T'$ rooted at $u$, for any $s \in T'_u$, the event $\{d^+_{\hat{\mathcal{T}}}(v) = d_i \text{ and } d^+_{\hat{\mathcal{T}}}(s) > 0\}$ is
impossible and has probability 0; furthermore, the same holds for each $\mathcal{T}_n$,
and thus, since $\mathcal{T}_n \xrightarrow{d} \mathcal{T}$ along a subsequence, $\mathbb{P}\bigl(d^+_{\mathcal{T}}(v) = d_i \text{ and } d^+_{\mathcal{T}}(s) >
0\bigr) = 0$. Consequently, if (15.1) contains a condition $d^+(v_j) = d_j$ with
$v_j \in T'_u$ and $d_j > 0$, then both sides are trivially 0. On the other hand, if
$d_j = 0$ for all $v_j \in T'_u$, then the conditions $d^+(v_j) = d_j$ are redundant in
(15.1) and may be deleted, so we may replace $T'$ by the smaller tree with
$T'_u$ removed. Repeating this pruning, if necessary, we see that it suffices to
show (15.1) for $V = V(T')$ when $T' \in \mathfrak{T}_f$ is a finite tree and $d_i \ge d^+_{T'}(v_i)$ for
every $i$.

Recall that $d_i$ in (15.1) may be infinite. We study three different cases
separately.

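Case (a): Every $d_i < \infty$. This is the case treated in Lemma 14.9; we take
the limit as $n \to \infty$ in (14.10) and obtain by Theorem 10.7 (with $m = n - 1$
and $\lambda = 1 < \omega(\mathbf{w})$), letting again $D := \sum_{i=1}^{\ell} d_i$,

$$\mathbb{P}\bigl(d^+_{\mathcal{T}_n}(v_i) = d_i \text{ for } i = 1, \dots, \ell\bigr) \to (D - \ell + 1) \prod_{i=1}^{\ell} \pi_{d_i}.$$

Since we have assumed $\mathcal{T}_n \xrightarrow{d} \mathcal{T}$ along a subsequence, this yields

$$\mathbb{P}\bigl(d^+_{\mathcal{T}}(v_i) = d_i \text{ for } i = 1, \dots, \ell\bigr) = (D - \ell + 1) \prod_{i=1}^{\ell} \pi_{d_i}. \tag{15.2}$$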

Now consider the modified Galton-Watson tree $\hat{\mathcal{T}}$. (Recall its
construction in Section 5.) If the tree $\hat{\mathcal{T}}$ has $d^+(v_i) = d_i < \infty$ for all $v_i \in T'$, then
the spine has to extend outside $T'$. The first point on the spine outside $T'$
is a node in $\partial T'$ (regarding $T'$ as a subtree of $\hat{\mathcal{T}}$). The condition $d^+_{\hat{\mathcal{T}}}(v_i) = d_i$
for $v_i \in T'$ determines the boundary $\partial T'$ of $T'$ in $\hat{\mathcal{T}}$, which thus does not depend
on $\hat{\mathcal{T}}$, and Lemma 14.1 shows that $|\partial T'| = D - \ell + 1$.

Fix a node $u \in \partial T'$, and consider the event $\mathcal{E}_u$ that the spine of $\hat{\mathcal{T}}$ passes
through $u$ and that $d^+_{\hat{\mathcal{T}}}(v_i) = d_i$ for $i = 1, \dots, \ell$. The event $\mathcal{E}_u$ thus specifies
the nodes in $T'$ that are special in the construction of $\hat{\mathcal{T}}$ (viz. the nodes on the
path from $o$ to $u$), and for each special node it specifies which of its children
will be special; furthermore it specifies the number of children for each
node in $T'$, special or not. Recall that the probability that a special node
has $d < \infty$ children, with a given one of them being special, is $\pi_d$, just as
the probability that a normal node has $d$ children. Thus, by independence,
for every $u \in \partial T'$, $\mathbb{P}(\mathcal{E}_u) = \prod_{i=1}^{\ell} \pi_{d_i}$. This probability thus does not depend
on $u$, so summing over the $D - \ell + 1$ nodes $u \in \partial T'$ we obtain

$$\mathbb{P}\bigl(d^+_{\hat{\mathcal{T}}}(v_i) = d_i \text{ for } i = 1, \dots, \ell\bigr) = \sum_{u \in \partial T'} \mathbb{P}(\mathcal{E}_u) = (D - \ell + 1) \prod_{i=1}^{\ell} \pi_{d_i},$$

which together with (15.2) shows (15.1) in this case. (Cf. Remark 5.7 for a
similar argument.)
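Case (b): Exactly one $d_i = \infty$. Suppose that $d_j = \infty$ and $d_i < \infty$ for $i \ne j$.
Define, for $0 \le k \le \infty$,

$$A_k := \{T \in \mathfrak{T} : d_T^+(v_i) = d_i \text{ for } i \ne j \text{ and } d_T^+(v_j) = k\}.$$

We thus want to show $\mathbb{P}(\mathcal{T} \in A_\infty) = \mathbb{P}(\hat{\mathcal{T}} \in A_\infty)$. We define further

$$A_{\ge K} := \bigcup_{K \le k \le \infty} A_k,$$

and note that since $\mathcal{T}_n \xrightarrow{d} \mathcal{T}$ (along a subsequence), we have (along the
subsequence), for any finite $K$,

$$\mathbb{P}(\mathcal{T}_n \in A_{\ge K}) \to \mathbb{P}(\mathcal{T} \in A_{\ge K}). \tag{15.3}$$

We define also (for finite $k$) the analogous

$$\mathcal{B}_k := \{(y_1, \dots, y_n) \in \mathcal{B}_{n-1,n} : y_j = k \text{ and } y_i = d_i \text{ for } i \le \ell \text{ with } i \ne j\}.$$

Then Lemma 14.9 can be written, with $D' := \sum_{i \ne j} d_i < \infty$,

$$\mathbb{P}(\mathcal{T}_n \in A_k) = \frac{n}{n - \ell}\,(k + D' - \ell + 1)\, \mathbb{P}(B_{n-1,n} \in \mathcal{B}_k). \tag{15.4}$$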

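Consider, for simplicity, $k > \max_{i \ne j} d_i$. Then (13.43) shows that, with
$N_i = N_i(B_{n-1,n})$,

$$\mathbb{P}(B_{n-1,n} \in \mathcal{B}_k) = \mathbb{E}\,\mathbb{P}\bigl(B_{n-1,n} \in \mathcal{B}_k \mid N_0, N_1, \dots\bigr) = \mathbb{E}\Bigl(\frac{N_k}{n} \prod_{i \ne j} \frac{N_{d_i} + O(1)}{n}\Bigr).$$

(The implicit constants in the $O$ in this proof may depend on $\ell$ and $d_1, \dots, d_\ell$,
but not on $n$ or $k$.) Consequently, by (15.4),

$$\begin{aligned}
\mathbb{P}(\mathcal{T}_n \in A_k) &= (k + O(1))\bigl(1 + O(n^{-1})\bigr)\, \mathbb{E}\Bigl(\frac{N_k}{n} \prod_{i \ne j} \frac{N_{d_i} + O(1)}{n}\Bigr) \\
&= \bigl(1 + O(k^{-1}) + O(n^{-1})\bigr)\, \mathbb{E}\Bigl(\frac{k N_k}{n} \prod_{i \ne j} \frac{N_{d_i} + O(1)}{n}\Bigr).
\end{aligned}$$

Summing over $k \ge K$, we obtain for any $K$, using $\sum_{k=0}^{\infty} k N_k = n - 1$ for
any allocation in $\mathcal{B}_{n-1,n}$,

$$\begin{aligned}
\mathbb{P}(\mathcal{T}_n \in A_{\ge K}) &= \sum_{k=K}^{\infty} \mathbb{P}(\mathcal{T}_n \in A_k) \\
&= \bigl(1 + O(K^{-1}) + O(n^{-1})\bigr)\, \mathbb{E}\Bigl(\frac{\sum_{k \ge K} k N_k}{n} \prod_{i \ne j} \frac{N_{d_i} + O(1)}{n}\Bigr) \\
&= \bigl(1 + O(K^{-1}) + O(n^{-1})\bigr)\, \mathbb{E}\Bigl(\frac{n - 1 - \sum_{k < K} k N_k}{n} \prod_{i \ne j} \frac{N_{d_i} + O(1)}{n}\Bigr).
\end{aligned} \tag{15.5}$$

By Theorem 10.4, for any fixed $K$, as $n \to \infty$,

$$\frac{n - 1 - \sum_{k < K} k N_k}{n} \prod_{i \ne j} \frac{N_{d_i} + O(1)}{n} \xrightarrow{p} \Bigl(1 - \sum_{k < K} k\pi_k\Bigr) \prod_{i \ne j} \pi_{d_i}.$$

By dominated convergence, the expectation converges to the same limit, and
thus (15.3) and (15.5) yield

$$\mathbb{P}(\mathcal{T} \in A_{\ge K}) = \bigl(1 + O(K^{-1})\bigr) \Bigl(1 - \sum_{k < K} k\pi_k\Bigr) \prod_{i \ne j} \pi_{d_i}. \tag{15.6}$$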

P(rG^oo)= (l- 5^ fcvTfc) JjTTrf^ =(l-/7)J]7rrf,. (15.7) 

fc<oo j^j i^^j 

Now consider T. If di{vj) = dj = oo, then the spine ends with an 
explosion at Vj. This fixes the spine, and the event that di(fj) = di for 
i ^ j then means, just as in case (a) when we considered a specific iS^, that 
we have specified the number of children to be di for these nodes, and for 
the special nodes (except vj) we have also specified which child is special. 
The probability of this is vr^- for each i ^ j, and the probability that the 
special node vj has an infinite number of children is, by (j5.2p . 1 — fi. Hence, 
by independence, 

p(fG^oo) = (i-/")n^'^.' (15-8) 
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which together with j^TKl} shows P(T e Aoo) = F(T G ^oo), which is (fTETTl 
in this case. 

Case (c): More than one $d_i = \infty$. By the definition of the modified
Galton-Watson tree $\hat{\mathcal{T}}$, there is at most one node with infinite degree, so in
this case,

$$\mathbb{P}\bigl(d^+_{\hat{\mathcal{T}}}(v_i) = d_i \text{ for } i = 1, \dots, \ell\bigr) = 0.$$

This means that the sum of these probabilities for all sequences $(d_1, \dots, d_\ell)$
with at most one infinite value is 1. But we have shown that for such
sequences, the probability is the same for $\mathcal{T}$ as for $\hat{\mathcal{T}}$, so the probabilities
for $\mathcal{T}$ for these sequences also sum up to 1. Consequently, if more than one
$d_i = \infty$, then

$$\mathbb{P}\bigl(d^+_{\mathcal{T}}(v_i) = d_i \text{ for } i = 1, \dots, \ell\bigr) = 0$$

too, which shows (15.1) in this case.

This shows that (15.1) holds for any $v_1, \dots, v_m$ such that $\{v_1, \dots, v_m\} =
V(T')$ where $T' \in \mathfrak{T}_f$ is a finite tree and $(d_1, \dots, d_m)$ is any sequence in $\overline{\mathbb{N}}_0^m$
with $d_i \ge d^+_{T'}(v_i)$ for every $i$. As discussed above, this implies (15.1) in full
generality and thus $\mathcal{T} \overset{d}{=} \hat{\mathcal{T}}$, which shows that $\mathcal{T}_n \xrightarrow{d} \hat{\mathcal{T}}$. □

16. Proofs of Theorems 17.111 and 17.121 

We begin by stating another version of the correspondence between simply 
generated trees and the balls-in-boxes model. 

Lemma 16.1. We may couple $\mathcal T_n$ and $B_{n-1,n}$ such that the degree sequence
$\Lambda(\mathcal T_n)$ is a cyclic shift of $B_{n-1,n}$, and, conversely, $B_{n-1,n}$ is a uniformly
random cyclic shift of $\Lambda(\mathcal T_n)$.

Proof. Let $B_{n-1,n} = (Y_1,\dots,Y_n)$ and let $(Y_{\sigma(1)},\dots,Y_{\sigma(n)})$ be the unique
cyclic shift of $(Y_1,\dots,Y_n)$ that is the degree sequence of a tree in $\mathfrak T_n$, see

Corollary 114.41 Then (yo-(i), . . . , 5^o-(n)) = A{Tn), as a consequence of Corol- 
lary [T33] and the invariance of the weight w(Yi, . . . , Y^) under cyclic shifts. 
Consequently, we may couple $B_{n-1,n}$ and $\mathcal T_n$ such that $(Y_{\sigma(1)},\dots,Y_{\sigma(n)}) = \Lambda(\mathcal T_n)$,
and the result follows. □
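The cyclic-shift correspondence in Lemma 16.1 can be illustrated by a small sketch. It assumes that the condition in Lemma 14.2 is the standard one for depth-first degree sequences: $(d_1,\dots,d_n)$ is the degree sequence of a tree if and only if $\sum_{i\le k} d_i \ge k$ for $1 \le k < n$ and $\sum_{i \le n} d_i = n-1$. The function names are hypothetical and chosen only for this illustration.

```python
from itertools import product


def is_tree_degree_sequence(d):
    """Check the assumed Lemma 14.2 condition for a depth-first degree sequence."""
    n = len(d)
    total = 0
    for k, x in enumerate(d, start=1):
        total += x
        # A proper prefix must leave at least one node still to be visited.
        if k < n and total < k:
            return False
    return total == n - 1


def tree_shifts(y):
    """Return every offset s such that y[s:] + y[:s] is a tree degree sequence."""
    return [s for s in range(len(y)) if is_tree_degree_sequence(y[s:] + y[:s])]


# One allocation of n - 1 = 4 balls in n = 5 boxes.
example_shifts = tree_shifts([0, 0, 2, 0, 2])

# Brute-force check for n = 4: every allocation of n - 1 balls has exactly one valid shift.
n = 4
all_unique = all(
    len(tree_shifts(list(y))) == 1
    for y in product(range(n), repeat=n)
    if sum(y) == n - 1
)
```

The brute-force loop is quadratic in $n$ per allocation; it is meant as a check of the uniqueness statement, not as an efficient sampler.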

Proof of Theorem \7.11\ We use the coupling in Lemma 116. 1[ Then in 
Theorem [TH] equals Nd{Bn-i.n) in Theorem [1031 and thus ([7T2]) follows 
by (fTaT5D . 

We obtain (j7.1ip as a simple consequence of (j7.12p . using P((i^ [v) = 
d\ Nd) = Nd/n and thus P(d:f^(v) = d) = EA^^/n, cf. the proof of" Theo- 
rem (THiZl Alternatively, we can arrange so that dj-^{v) = Yi, and the result 
then follows by Theorem 110.71 □ 



Proof of Theorem 7.12. We use again the coupling in Lemma 16.1. Let $T$ be
a fixed tree of size $\ell$ and let its degree sequence be $(d_1,\dots,d_\ell)$. Recall that
we have defined the degree sequence using depth-first search. It follows that
if a tree has degree sequence $(d_1,\dots,d_n)$ and a node $v$ is visited as node $v_j$
in the depth-first search, then the subtree rooted at $v$ has degree sequence
$(d_j,\dots,d_k)$, where we stop when this is a degree sequence of a tree, i.e.,
when it satisfies the condition in Lemma 14.2. In particular, the subtree
rooted at $v$ equals $T$ if and only if $(d_j,\dots,d_{j+\ell-1}) = (d_1,\dots,d_\ell)$. (Clearly,
this is impossible if $j > n-\ell+1$, since then a tree would be completed with
size less than $\ell$.)

Consequently, $N_T$ equals the number of substrings $(d_1,\dots,d_\ell)$ in $(Y_1,\dots,Y_n)$,
regarded as a cyclic sequence. In other words, if we let $I_j$ be the indicator
of the event $(Y_j,\dots,Y_{j+\ell-1}) = (d_1,\dots,d_\ell)$, where we define $Y_i := Y_{i-n}$ for
$i > n$, then

$$N_T = \sum_{j=1}^n I_j. \qquad (16.1)$$

In particular, taking the expectation and using the rotational symmetry,

$$P(\mathcal T_{n;v} = T) = \frac1n E N_T = E I_1 = P\bigl((Y_1,\dots,Y_\ell) = (d_1,\dots,d_\ell)\bigr),$$

and thus Theorem 10.7 yields

$$P(\mathcal T_{n;v} = T) \to \prod_{i=1}^{\ell} \pi_{d_i} = P(\mathcal T = T),$$

which proves (7.13).
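The identity (16.1) says that counting fringe subtrees equal to $T$ amounts to counting occurrences of a fixed pattern in a cyclic sequence. A minimal sketch follows; the function name is hypothetical, and the example sequences are chosen only for illustration.

```python
def count_cyclic_occurrences(y, pattern):
    """Number of positions j where pattern matches y read cyclically from j."""
    n, length = len(y), len(pattern)
    return sum(
        all(y[(j + i) % n] == pattern[i] for i in range(length))
        for j in range(n)
    )


# Depth-first degree sequence of a tree with 5 nodes: the root has two children,
# the second child has two leaf children.
tree_sequence = [2, 0, 2, 0, 0]
cherry = [2, 0, 0]   # degree sequence of a root with two leaf children
leaf = [0]

n_cherry = count_cyclic_occurrences(tree_sequence, cherry)
n_leaf = count_cyclic_occurrences(tree_sequence, leaf)

# The count is invariant under cyclic shifts, which is why it can be read off
# directly from the allocation (Y_1, ..., Y_n) in the coupling.
n_cherry_shifted = count_cyclic_occurrences([0, 0, 2, 0, 2], cherry)
```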

In order to show the stronger result (7.14), we condition as in the proof
of Theorem 10.7 on $N_0, N_1, \dots$ and obtain, see (13.43),

$$E(I_j \mid N_0,N_1,\dots) = P\bigl((Y_1,\dots,Y_\ell) = (d_1,\dots,d_\ell) \mid N_0,N_1,\dots\bigr) = \prod_{i=1}^{\ell} \frac{N_{d_i} - c_i}{n-i+1} = \prod_{i=1}^{\ell} \frac{N_{d_i}}{n} + O\Bigl(\frac1n\Bigr), \qquad (16.2)$$

where $c_i := |\{j < i : d_j = d_i\}|$. If $|j-k| \ge \ell$ and $|j-k\pm n| \ge \ell$ (i.e., $j$ and
$k$ have distance at least $\ell$, regarded as points on a circle of length $n$), then
similarly, with $c'_i := |\{j \le \ell : d_j = d_i\}|$,

$$E(I_j I_k \mid N_0,N_1,\dots) = \prod_{i=1}^{\ell} \frac{N_{d_i} - c_i}{n-i+1} \prod_{i=1}^{\ell} \frac{N_{d_i} - c_i - c'_i}{n-\ell-i+1},$$

and it follows that

$$\operatorname{Cov}(I_j, I_k \mid N_0,N_1,\dots) = O(1/n). \qquad (16.3)$$

For $j$ and $k$ of distance less than $\ell$ we use the trivial

$$|\operatorname{Cov}(I_j, I_k \mid N_0,N_1,\dots)| \le 1. \qquad (16.4)$$

There are fewer than $n^2$ pairs $(j,k)$ of the first type and $O(n)$ pairs of the
second type, and thus by (16.1) and (16.3)-(16.4),

$$\operatorname{Var}(N_T \mid N_0,N_1,\dots) = \sum_{j=1}^n \sum_{k=1}^n \operatorname{Cov}(I_j, I_k \mid N_0,N_1,\dots) = O(n).$$

Consequently, $N_T/n - E(N_T/n \mid N_0,N_1,\dots) \overset{p}{\longrightarrow} 0$, and thus by (16.1),
(16.2) and Theorem 10.4,

$$\frac{N_T}{n} = E\Bigl(\frac{N_T}{n} \Bigm| N_0,N_1,\dots\Bigr) + o_p(1) = \prod_{i=1}^{\ell} \frac{N_{d_i}}{n} + o_p(1) \overset{p}{\longrightarrow} \prod_{i=1}^{\ell} \pi_{d_i} = P(\mathcal T = T). \qquad □$$

17. Asymptotics of the partition functions

We have a simple asymptotic result for the partition function $Z(m,n)$ (to
the first order in the exponent, at least if $\rho > 0$):

Theorem 17.1. Let $\mathbf w = (w_k)_{k\ge0}$ be a weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 1$. Suppose that $n \to \infty$ and $m = m(n)$ with $\operatorname{span}(\mathbf w) \mid m$,
$m \to \infty$ and $m/n \to \lambda$ where $0 \le \lambda < \omega$, and let $\tau$ be as in Theorem 10.4.

(i) If $\rho > 0$, then

$$\frac1n \log Z(m,n) \to \log\Phi(\tau) - \lambda\log\tau \in (-\infty,\infty). \qquad (17.1)$$

(ii) If $\rho = 0$ and $\lambda > 0$, then

$$\frac1n \log Z(m,n) \to \infty. \qquad (17.2)$$

In both cases, the result can be written

$$\frac1n \log Z(m,n) \to \log \inf_{0\le t<\rho} \frac{\Phi(t)}{t^\lambda} = \log \inf_{0\le t<\infty} \frac{\Phi(t)}{t^\lambda} \le \infty. \qquad (17.3)$$

If $0 \le \lambda \le \nu$ and $\rho > 0$, the limit can also be written $\log\Phi(\tau) - \Psi(\tau)\log\tau$.
The formula (17.1) is shown by a physicists' proof by Bialas, Burda and
Johnston [14].
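As a numerical illustration of (17.1), the following sketch computes $Z(m,n)$ exactly by dynamic programming for the weights $w_k = 1/k!$. This choice is an assumption made only for the example: then $\Phi(t) = e^t$, $\Psi(t) = t$ and hence $\tau = \lambda$, so the limit in (17.1) is $\lambda - \lambda\log\lambda$, and $Z(m,n) = n^m/m!$ in closed form. The function name is hypothetical.

```python
from fractions import Fraction
from math import factorial, log


def partition_function(w, m, n):
    """Z(m, n): sum over (y_1, ..., y_n) with y_1 + ... + y_n = m of prod w[y_i].

    w must contain the weights w_0, ..., w_m.
    """
    z = [Fraction(1)] + [Fraction(0)] * m      # zero boxes: only the empty allocation
    for _ in range(n):                          # add one box at a time
        z = [sum(w[k] * z[s - k] for k in range(s + 1)) for s in range(m + 1)]
    return z[m]


m, n = 20, 40
w = [Fraction(1, factorial(k)) for k in range(m + 1)]
Z = partition_function(w, m, n)

lam = m / n
rate = log(float(Z)) / n           # (1/n) log Z(m, n)
limit = lam - lam * log(lam)       # log Phi(tau) - lambda log tau with tau = lambda
```

For these small values the two numbers differ by a term of order $(\log n)/n$, which is consistent with the statement that (17.1) only captures the first order in the exponent.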

Remark 17.2. If $\lambda = 0$, then $\tau = 0$, and we interpret the right-hand side
of (17.1) as $\log\Phi(0) = \log w_0$; this is in accordance with (17.3).

It is easily seen that the result holds, with this limit, also in the rather
trivial case when $m$ is bounded, provided $Z(m,n) > 0$.

Remark 17.3. If $\omega < \infty$, then the result holds also when $\lambda = \omega$ provided
$Z(m,n) > 0$, if we let $\tau = \infty$ as in Remark 10.10 and interpret the right-
hand side of (17.1) as the limit value $\log w_\omega$, which again is in accordance
with (17.3). This follows from Remark 17.2 by the symmetry argument in
Remark [mni 

Remark 17.4. Using the function $\tau(x)$ defined in Theorem 10.6, the result
(17.1) can also be written, using the continuity of $\tau(x)$ and an extra
argument (which we omit) when $\lambda = 0$,

$$\log Z(m,n) = n\log\Phi(\tau(m/n)) - m\log\tau(m/n) + o(n) \qquad (17.4)$$

or, equivalently,

$$Z(m,n) = \Phi(\tau(m/n))^n\, \tau(m/n)^{-m}\, e^{o(n)}. \qquad (17.5)$$

As in Theorem 10.6, it suffices here that $m/n \le C < \infty$ (and $m \to \infty$).

Proof of Theorem 17.1. Note that the assumptions imply that $Z(m,n) > 0$
(at least for $n$, and thus $m$, large) by Lemma 12.3. The equivalence between
(17.1)-(17.2) and (17.3) follows from (10.16).

(i): Assume first $\lambda > 0$. Since $\rho > 0$ and $\lambda > 0$, we then have $\tau > 0$.
Thus $\mathbf w = (w_k)$ is equivalent to $\boldsymbol\pi = (\pi_k)$, and Lemma 10.3 yields

$$Z(m,n) = Z(m,n;\mathbf w) = \Phi(\tau)^n \tau^{-m} Z(m,n;\boldsymbol\pi).$$

We saw in the proof of Theorem 10.4, case (a), that Lemmas 13.3 and 13.4
yield $Z(m,n;\boldsymbol\pi) = \exp(o(n))$, and thus

$$Z(m,n) = \exp\bigl(n\log\Phi(\tau) - m\log\tau + o(n)\bigr),$$

which yields (17.1).

It remains to consider the case $\lambda = 0$. Then $m/n \to 0$, and we may
assume $m < n/2$. In any allocation of $m$ balls, there are at most $m$ non-
empty boxes. Let us mark $2m$ boxes, including all non-empty boxes. For
each choice of the marked boxes, we have in them an allocation in $B_{m,2m}$,
and only empty boxes outside; since there are $\binom{n}{2m}$ choices of marked boxes,

$$Z(m,n) \le \binom{n}{2m} w_0^{\,n-2m} Z(m,2m). \qquad (17.6)$$

On the other hand, any allocation of $m$ balls in $2m$ boxes can be extended
to an allocation in $B_{m,n}$ with the last $n-2m$ boxes empty; thus

$$Z(m,n) \ge w_0^{\,n-2m} Z(m,2m). \qquad (17.7)$$

We have, by Stirling's formula, using $m/n \to \lambda = 0$,

$$\frac1n \log\binom{n}{2m} \le \frac1n \log\Bigl(\frac{en}{2m}\Bigr)^{2m} = \frac{2m}{n}\log\frac{e}{2} - \frac{2m}{n}\log\frac{m}{n} \to 0. \qquad (17.8)$$

Moreover, by the case $\lambda > 0$ just proved, we have from (17.1) $\log Z(m,2m) =
O(m) = o(n)$. Consequently, (17.6)-(17.8) yield

$$\log Z(m,n) = (n-2m)\log w_0 + o(n) = n\log w_0 + o(n),$$

showing (17.1) in the case $\lambda = 0$.

(ii): As in the proof of Lemma 13.3, we use the truncated weight sequence
$\mathbf w^{(K)}$ defined in (13.13), where $K$ is so large that $\operatorname{span}(\mathbf w^{(K)}) = \operatorname{span}(\mathbf w)$ and
$\nu(\mathbf w^{(K)}) > \lambda$, and we let again $\Phi_K$ and $\Psi_K$ be the corresponding functions
for $\mathbf w^{(K)}$ and define $\tau_K$ by $\Psi_K(\tau_K) = \lambda$.

For any $t > 0$, $\Phi_K(t) \to \Phi(t) = \infty$ as $K \to \infty$, and thus (13.18) holds,
showing that for large $K$, $\Psi_K(t) > \lambda$ and thus $\tau_K < t$. Since $t$ is arbitrary,
this shows that $\tau_K \to 0$ as $K \to \infty$. Applying (i) to $\mathbf w^{(K)}$ and its partition
function $Z_K$ we obtain, for every large $K$,

$$\liminf_{n\to\infty} \frac1n \log Z(m,n) \ge \lim_{n\to\infty} \frac1n \log Z_K(m,n) = \log\Phi_K(\tau_K) - \lambda\log\tau_K \ge \log w_0 - \lambda\log\tau_K.$$

As $K \to \infty$, $\tau_K \to 0$, so the right-hand side tends to $\infty$, which completes the
proof. □

Remark 17.5. The case $\rho = 0$ and $\lambda = 0$ is excluded from Theorem 17.1;
in this case, almost anything can happen. To see this, note first that by
(17.6)-(17.8), if $m/n \to \lambda = 0$,

$$\frac1n \log Z(m,n) = \log w_0 + \frac1n \log Z(m,2m) + o(1), \qquad (17.9)$$

and by Theorem 17.1(ii), $\frac1m \log Z(m,2m) \to \infty$ as $m \to \infty$, and hence
$m/\log Z(m,2m) \to 0$. We can choose $m = m(n) \to \infty$ with $m/n \to 0$ so
rapidly that $m/n \ll m/\log Z(m,2m)$, and then $\frac1n \log Z(m,2m) \to 0$ and
(17.9) yields $\frac1n \log Z(m,n) \to \log w_0 = \log\Phi(0)$.

We can also choose $m$ with $m/n \to 0$ so slowly that $m/n \gg m/\log Z(m,2m)$,
and then $\frac1n \log Z(m,2m) \to \infty$ and (17.9) yields $\frac1n \log Z(m,n) \to \infty$.

Furthermore, we can choose $m(n)$ oscillating between these two cases, and
then $\liminf \frac1n \log Z(m,n) = \log\Phi(0)$ and $\limsup \frac1n \log Z(m,n) = \infty$, and we
can arrange so that every number in $[\log\Phi(0),\infty)$ is a limit point of some
subsequence.

For many weight sequences with /? = 0, one can choose m(n) such that 
^ log Z(m, n) — )• o for any given a E [log <^(0), 00]. For example for Wk = k\ 
as in Example 19.81 we have by [gJI and Theorem 1 1 4 . 5 1 Z (n — 1, n) ~ en! and 
it follows, arguing similarly to (jl7.6p and (|17.7p . that ^logZ(m, 2m) = 
logm + 0(1), so taking m ~ an/ log n, we obtain - log Z(m,n) — )• a by 

However, if $w_k$ increases very rapidly, it may be impossible to obtain
convergence of the full sequence to a limit different from $\log\Phi(0)$ or $\infty$, so
we can only achieve convergence of subsequences. For example, if $w_0 = 1$ and
$w_{k+1} \ge Z(k,2k)^2$, then $Z(k+1,2(k+1)) \ge w_{k+1} \ge Z(k,2k)^2$, and it follows
easily from (17.9) that $\limsup \frac1n \log Z(m,n) \ge 2\liminf \frac1n \log Z(m,n)$.

We apply Theorem 17.1 to simply generated trees.

Theorem 17.6. Let $\mathbf w = (w_k)_{k\ge0}$ be any weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 2$. Suppose that $n \to \infty$ with $n \equiv 1 \pmod{\operatorname{span}(\mathbf w)}$,
and let $\tau$ be as in Theorem 7.1. Then

$$\frac1n \log Z_n \to \log\Phi(\tau) - \log\tau = \log \inf_{0\le t<\infty} \frac{\Phi(t)}{t} \in (-\infty,\infty].$$

The limit is finite if $\rho > 0$, and $+\infty$ if $\rho = 0$.

Proof. An immediate consequence of Theorems 14.5 and 17.1. □

For probability weight sequences, Theorem 17.6 can be expressed as follows,
cf. Remark 7.9.

Theorem 17.7. Let $\mathcal T$ be a Galton-Watson tree with offspring distribution
$\xi$, and assume that $P(\xi = 0) > 0$ and $P(\xi > 1) > 0$. Suppose that $n \to \infty$
with $n \equiv 1 \pmod{\operatorname{span}(\xi)}$, and let $\tau$ be as in Theorem 7.1. Then

$$\frac1n \log P(|\mathcal T| = n) \to \log\Phi(\tau) - \log\tau = \log \inf_{0\le t<\infty} \frac{\Phi(t)}{t} \in (-\infty,0].$$

If $E\xi = 1$, or if $E\xi < 1$ and $\rho = 1$, then the limit is 0; otherwise it is
strictly negative. In other words, $P(|\mathcal T| = n)$ decays exponentially fast in the
supercritical case (then $\tau < 1$) and in the subcritical case with $\rho > 1$ (then
$\tau > 1$), but only subexponentially in the critical case and in the subcritical
case with $\rho = 1$ (then $\tau = 1$).

Proof. We have $P(|\mathcal T| = n) = Z_n$, see Section 2, and we apply Theorem 17.6.
Since now $(w_k)$ is a probability weight sequence, we have $\rho \ge 1$
and $\inf_{0\le t<\infty} \Phi(t)/t \le \Phi(1)/1 = 1$, with equality if and only if $\tau = 1$,
see Remark 7.4. The final claims follow using the definition of $\tau$ in Theorem 7.1. □

When $\rho > 0$ and $\lambda > 0$ (which are equivalent to $\tau > 0$), we can also
prove stronger "local" versions of Theorems 17.1 and 17.6, showing that the
partition function behaves smoothly for small changes in $m$ or $n$.

Theorem 17.8. Let $\mathbf w = (w_k)_{k\ge0}$ be a weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 1$. Suppose that $n \to \infty$ and $m = m(n)$ with $m/n \to \lambda$
where $0 < \lambda < \omega$, and let $\tau$ be as in Theorem 10.4. If $\rho > 0$, then, for every
fixed $k$ such that $\operatorname{span}(\mathbf w) \mid k$,

$$\frac{Z(m+k,n)}{Z(m,n)} \to \tau^{-k}. \qquad (17.10)$$

Proof. For any $k \ge 0$, by the definitions of $B_{m,n}$ and $Z(m,n)$,

$$P(Y_1 = k) = \frac{w_k Z(m-k,n-1)}{Z(m,n)} \qquad (17.11)$$

and thus

$$\frac{P(Y_1 = k)}{P(Y_1 = 0)} = \frac{w_k Z(m-k,n-1)}{w_0 Z(m,n-1)}.$$

Since Theorem 10.7 yields

$$\frac{P(Y_1 = k)}{P(Y_1 = 0)} \to \frac{\pi_k}{\pi_0} = \frac{\tau^k w_k}{w_0},$$

we see (replacing $n$ by $n+1$) that (17.10) holds when $-k \in \operatorname{supp}(\mathbf w)$. Furthermore,
the set of $k \in \mathbb Z$ such that (17.10) holds for any allowed sequence
$m(n)$ is easily seen to be a subgroup of $\mathbb Z$ (since we may replace $m$ by $m \pm k'$
for any fixed $k'$). Consequently, by (3.3), this set contains every multiple of
$\operatorname{span}(\mathbf w)$. □

Theorem 17.9. Let $\mathbf w = (w_k)_{k\ge0}$ be a weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 1$. Suppose that $n \to \infty$ and $m = m(n)$ with $m/n \to \lambda$
where $0 \le \lambda < \omega$, and let $\tau$ be as in Theorem 10.4. Then,

$$\frac{Z(m,n+1)}{Z(m,n)} \to \Phi(\tau). \qquad (17.12)$$

Proof. By (17.11) with $k = 0$ and Theorem 10.7,

$$\frac{w_0 Z(m,n-1)}{Z(m,n)} = P(Y_1 = 0) \to \pi_0 = \frac{w_0}{\Phi(\tau)},$$

and the result follows since $w_0 > 0$. □
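The local ratios (17.10) and (17.12) can be checked exactly in a case where the partition function has a closed form. The sketch below assumes the weights $w_k = 1$ for all $k \ge 0$ (an illustrative choice, not taken from the text), for which $Z(m,n) = \binom{m+n-1}{n-1}$, $\Phi(t) = 1/(1-t)$, $\Psi(t) = t/(1-t)$ and therefore $\tau = \lambda/(1+\lambda)$ and $\Phi(\tau) = 1+\lambda$.

```python
from fractions import Fraction
from math import comb


def Z(m, n):
    """Partition function for w_k = 1: number of allocations of m balls in n boxes."""
    return comb(m + n - 1, n - 1)


n = 2000
m = 2 * n                      # lambda = m/n = 2
lam = m / n
tau = lam / (1 + lam)          # solves Psi(tau) = tau / (1 - tau) = lambda
phi_tau = 1 / (1 - tau)        # Phi(tau) = 1 + lambda

# (17.10) with k = 1: Z(m+1, n) / Z(m, n) should be close to 1 / tau.
ratio_m = float(Fraction(Z(m + 1, n), Z(m, n)))

# (17.12): Z(m, n+1) / Z(m, n) should be close to Phi(tau).
ratio_n = float(Fraction(Z(m, n + 1), Z(m, n)))
```

Here the ratios equal $(m+n)/(m+1)$ and $(m+n)/n$ exactly, so the convergence can also be read off by hand.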
For trees we have a corresponding result:

Theorem 17.10. Let $\mathbf w = (w_k)_{k\ge0}$ be a weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 2$. If $\rho > 0$ and $\operatorname{span}(\mathbf w) = 1$, then

$$\frac{Z_{n+1}}{Z_n} \to \frac{\Phi(\tau)}{\tau}.$$

Proof. By Theorems 14.5 and 17.8-17.9, using $Z_n = Z(n-1,n)/n$,

$$\frac{Z_{n+1}}{Z_n} = \frac{n\,Z(n,n+1)}{(n+1)\,Z(n-1,n)} = \frac{n}{n+1} \cdot \frac{Z(n,n+1)}{Z(n,n)} \cdot \frac{Z(n,n)}{Z(n-1,n)} \to \Phi(\tau)\tau^{-1}.$$

□

We assumed here span 1 for convenience only; if $\operatorname{span}(\mathbf w) = d$, we instead
obtain, by a similar argument, $Z_{n+d}/Z_n \to (\Phi(\tau)/\tau)^d$.

In the case $\nu \ge 1$ and $\sigma^2 = \tau\Psi'(\tau) < \infty$ (which is automatic if $\nu > 1$),
i.e. our case Iα, Theorem 17.6 can be sharpened substantially as follows, see
Otter (ai], Meir and Moon [si], Kolchin [tS], Drmota |33|]. 



Theorem 17.11. Let $\mathbf w = (w_k)$, $\tau$ and $\sigma^2$ be as in Theorem 7.1, and let
$d := \operatorname{span}(\mathbf w)$. If $\nu \ge 1$ and $\sigma^2 < \infty$, then, for $n \equiv 1 \pmod d$,

$$Z_n \sim \frac{d}{\sqrt{2\pi\sigma^2}}\,\tau\Bigl(\frac{\Phi(\tau)}{\tau}\Bigr)^{n} n^{-3/2} = d\sqrt{\frac{\Phi(\tau)}{2\pi\Phi''(\tau)}}\Bigl(\frac{\Phi(\tau)}{\tau}\Bigr)^{n} n^{-3/2}. \qquad (17.13)$$

Proof. Replacing $(w_k)$ by $(\pi_k)$ and using (4.3), we see that it suffices to
consider the case of a probability weight sequence with $\tau = \Phi(\tau) = 1$. By
Theorem 14.5, (10.5) and (8.1), in this case the result is equivalent to

$$P(S_n = n-1) \sim \frac{d}{\sqrt{2\pi\sigma^2 n}},$$

which is the local central limit theorem in this case, see e.g. Kolchin [76,
Theorem 1.4.2] or use Lemma 13.1 and Remark 13.2. □
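A quick sanity check of the asymptotic formula (17.13) is possible for plane trees. The sketch assumes the weights $w_k = 1$ (chosen only for this illustration), for which $Z_n = Z(n-1,n)/n = \binom{2n-2}{n-1}/n$ is a Catalan number, $\Phi(t) = 1/(1-t)$, $\tau = 1/2$, $\Phi(\tau) = 2$, $\Phi''(\tau) = 16$ and $d = 1$. Logarithms are used to avoid floating-point overflow; the function names are hypothetical.

```python
from math import comb, exp, log, pi


def log_Z_plane(n):
    """log Z_n for w_k = 1, using Z_n = comb(2n - 2, n - 1) / n."""
    return log(comb(2 * n - 2, n - 1)) - log(n)


def log_rhs_17_13(n, d, tau, phi, phi2):
    """log of d * sqrt(phi / (2 pi phi2)) * (phi / tau)**n * n**(-3/2)."""
    return log(d) + 0.5 * log(phi / (2 * pi * phi2)) + n * log(phi / tau) - 1.5 * log(n)


n = 1000
ratio = exp(log_Z_plane(n) - log_rhs_17_13(n, d=1, tau=0.5, phi=2.0, phi2=16.0))
```

The ratio of the exact value to the asymptotic expression should approach 1 as $n$ grows; for this example the relative error is of order $1/n$.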
There is a corresponding improvement of Theorem 17.1.

Theorem 17.12. Let $\mathbf w = (w_k)$, $m = m(n)$, $\tau$ and $\sigma^2$ be as in Theorem 10.4,
and let $d := \operatorname{span}(\mathbf w)$. If $0 < \lambda < \nu$, or $\lambda = \nu$ and $\sigma^2 < \infty$, then,
for $m = \lambda n + o(\sqrt n)$ with $m \equiv 0 \pmod d$,

$$Z(m,n) \sim \frac{d}{\sqrt{2\pi\sigma^2 n}}\,\Phi(\tau)^n \tau^{-m}. \qquad (17.14)$$

Proof. Again it suffices to consider the case of a probability weight sequence
with $\tau = \Phi(\tau) = 1$; this time using (10.9). In this case the result is by (10.5)
equivalent to

$$P(S_n = m) \sim \frac{d}{\sqrt{2\pi\sigma^2 n}},$$

which again is the local central limit theorem and follows e.g. by Lemma 13.1
and Remark 13.2. □

Remark 17.13. The asymptotic formula (17.14) holds for arbitrary $m =
m(n)$ with $0 < c \le m/n \le C < \omega$ and $m \equiv 0 \pmod d$, and either $C < \nu$
or $C = \nu$ and $\Phi''(\rho) < \infty$ (which means that $\Psi'(\rho) < \infty$ and thus the
distribution (10.13) has finite variance for $\tau = \rho$), provided $\tau$ is replaced
by $\tau(m/n)$ given by $\Psi(\tau(m/n)) = m/n$. (Cf. Theorem 10.6.) The proof is
essentially the same (as in the proof of Theorem 10.6, it suffices to consider
subsequences where $m(n)/n$ converges); we omit the details.

In the case $\nu = \lambda$ ($\nu = 1$ in the tree case) and $\sigma^2 = \infty$, we have no
general results but we can obtain similar more precise versions of Theorems
17.6 and 17.1 in the important case of a power-law weight sequence, Example 11.10.
(We need $1 < \alpha \le 2$ here; if $\alpha \le 1$, then $\nu = \infty > \lambda$, and if
$\alpha > 2$, then $\sigma^2 < \infty$ so Theorems 17.11 and 17.12 apply, see Example 11.10
with $\beta = \alpha + 1$. Note also that $\operatorname{span}(\mathbf w) = 1$.) The case $\lambda > \nu$ is treated in
Theorem 18.33 and Remark 18.34.

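Theorem 17.14. Suppose for some $c > 0$ and $\alpha$ with $1 < \alpha \le 2$,

$$w_k \sim c\,k^{-\alpha-1} \quad\text{as } k \to \infty. \qquad (17.15)$$

(i) If $\nu = 1$, then,

$$Z_n \sim \frac{\Phi(1)^{n+1/\alpha}}{c^{1/\alpha}\,\Gamma(-\alpha)^{1/\alpha}\,|\Gamma(-1/\alpha)|}\, n^{-1-1/\alpha}, \quad\text{when } 1 < \alpha < 2, \qquad (17.16)$$

and

$$Z_n \sim \Bigl(\frac{\Phi(1)}{\pi c}\Bigr)^{1/2} \Phi(1)^n n^{-3/2} (\log n)^{-1/2}, \quad\text{when } \alpha = 2. \qquad (17.17)$$

(ii) If $m = \nu n + o(n^{1/\alpha})$, then

$$Z(m,n) \sim \frac{\Phi(1)^{n+1/\alpha}}{c^{1/\alpha}\,\Gamma(-\alpha)^{1/\alpha}\,|\Gamma(-1/\alpha)|}\, n^{-1/\alpha}, \quad\text{when } 1 < \alpha < 2, \qquad (17.18)$$

and

$$Z(m,n) \sim \Bigl(\frac{\Phi(1)}{\pi c}\Bigr)^{1/2} \frac{\Phi(1)^n}{\sqrt{n\log n}}, \quad\text{when } \alpha = 2. \qquad (17.19)$$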

Proof. This time, we did not assume $w_0 > 0$, but we may do so without loss
of generality in the proof. In fact, if $w_0 = 0$, then $\nu > 1$, so in (i) we always
have $w_0 > 0$, and in (ii) we can reduce to the case $w_0 > 0$ by the method in
Remark [Tall 



Part (i) follows from Theorem 14.5 and (ii), taking $m = n-1$; hence it suffices
to prove (17.18)-(17.19).

We have $\rho = 1$, and in the usual notation $\lambda = \nu$ and thus $\tau = \rho = 1$.
We reduce to the probability weight sequence case by dividing each $w_k$ by
$\Phi(1)$ (which changes $c$ to $c/\Phi(1)$). Let $\xi$ be a random variable with the
distribution $(\pi_k) = (w_k)$. Then $E\xi = \nu$. Furthermore, (17.15) yields

$$P(\xi \ge k) = \sum_{l=k}^{\infty} w_l \sim c\,\alpha^{-1} k^{-\alpha}. \qquad (17.20)$$

Hence ^ is in the domain of attraction of an a-stable distribution, see iFellei 
[3^ . Section XVII. 5]. More precisely, if we first consider the case 1 < a < 2, 
then there exists an $\alpha$-stable random variable $X_\alpha$ such that

$$\frac{S_n - n\nu}{n^{1/\alpha}} \overset{\mathrm d}{\longrightarrow} X_\alpha. \qquad (17.21)$$

(The distribution of is given by (118.931) and (118.11311 below .) More- 
over, a local limit law holds, see e.g. iGnede nko and Kolmogorov |46l. ^ 501. 
Ilbragimov and LinnikI 54 . Theorem 4.2.1] or Bingham. Goldie and Teugeld 
id. Corollarv 8.4.3]. which says 

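$$P(S_n = \ell) = n^{-1/\alpha}\Bigl(g\Bigl(\frac{\ell - n\nu}{n^{1/\alpha}}\Bigr) + o(1)\Bigr), \qquad (17.22)$$

uniformly for all integers $\ell$, where $g$ is the density function of $X_\alpha$. In particular,

$$Z(m,n) = P(S_n = m) \sim n^{-1/\alpha} g(0). \qquad (17.23)$$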

The results in [s^. Sections XVII. 5-6] show, if we keep track of the constants 
(see e.g. ^] for calculations), that 

$$g(0) = \bigl(c\,\Gamma(-\alpha)\bigr)^{-1/\alpha}\,|\Gamma(-1/\alpha)|^{-1}, \qquad (17.24)$$

and (17.18) follows.

In the case a = 2, Section XVII. 5] similarly yields 

$$\frac{S_n - n\nu}{\sqrt{n\log n}} \overset{\mathrm d}{\longrightarrow} N(0, c/2); \qquad (17.25)$$

again a local limit theorem holds by [54, Theorem 4.2.1] or \ld, Corollary 
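8.4.3], and thus

$$P(S_n = \ell) = \frac{1}{\sqrt{n\log n}}\Bigl(g\Bigl(\frac{\ell - n\nu}{\sqrt{n\log n}}\Bigr) + o(1)\Bigr), \qquad (17.26)$$

uniformly in $\ell \in \mathbb Z$, where now $g(x)$ is the density function $(\pi c)^{-1/2} e^{-x^2/c}$
of $N(0,c/2)$. In particular,

$$Z(m,n) = P(S_n = m) \sim \frac{g(0)}{\sqrt{n\log n}} = \frac{1}{\sqrt{n\log n}\,\sqrt{\pi c}}, \qquad (17.27)$$

which proves (17.19). □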

Remark 17.15. The proof shows that (17.15) can be relaxed to (17.20)
together with $\operatorname{span}(\mathbf w) = 1$.

Example 17.16. Let $F_{m,n}$ be the number of labelled unrooted forests with
$m$ labelled nodes and $n$ labelled trees, see Example 11.7. Using the weights
Wk = kf'^'^ /k\ and Wk = e~^Wk ~ (27r)~^/^A;~^/^, we have by (|11.3ip and 

mm 

$$F_{m,n} = m!\,Z(m,n;\mathbf w) = m!\,e^{m} Z(m,n;\tilde{\mathbf w}). \qquad (17.28)$$

At the phase transition $m = 2n$, Theorem 17.14 applies to $\tilde{\mathbf w}$ with $\alpha = 3/2$.
We have c = (27r)-i/2 ^^^^ (fT02D . $(1) = <^{p) = 1/2. Hence IHJJ^ 
yields, after simplifications, 

$$\frac{F_{2n,n}}{(2n)!} = Z(2n,n;\mathbf w) = e^{2n} Z(2n,n;\tilde{\mathbf w}) \sim \frac{2^{-2/3}\,3^{-1/3}}{\Gamma(1/3)}\, e^{2n}\, 2^{-n}\, n^{-2/3}. \qquad (17.29)$$

(The constant can also be written $2^{-5/3}\,3^{1/6}\,\pi^{-1}\,\Gamma(2/3)$.) A more general
result is proved by the same method by Britikov [13]. Flajolet and Sedgewick
[40, Proposition VIII.11] show (17.29) by a different method (although there
is a computational error in the constant given in the result there).

We end this section by considering the behaviour of the generating function
$\mathcal Z(z) := \sum_{n=1}^{\infty} Z_n z^n$. The following immediate corollary of Theorem 17.6
was shown by Otter [93], see Minami [89] and, for $\nu > 1$, Flajolet and Sedgewick
[40, Proposition IV.5]. See also Remark 7.5.

Corollary 17.17. Let $(w_k)_{k\ge0}$ and $\tau$ be as in Theorem 7.1, and let $\rho_{\mathcal Z}$ be
the radius of convergence of the generating function $\mathcal Z(z) := \sum_{n=1}^{\infty} Z_n z^n$.
Then $\rho_{\mathcal Z} = \tau/\Phi(\tau)$. □

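Moreover, by (7.6), $\mathcal Z(\rho_{\mathcal Z}) = \tau < \infty$. Since the generating function
$\mathcal Z(z)$ has non-negative coefficients, it follows that $\mathcal Z(z)$ is continuous on the
closed disc $|z| \le \rho_{\mathcal Z}$, and $|\mathcal Z(z)| \le \tau$ there. If we, for simplicity, assume that
$\operatorname{span}(\mathbf w) = 1$, then $|\mathcal Z(z)| < |\mathcal Z(\rho_{\mathcal Z})| = \tau$ for $|z| \le \rho_{\mathcal Z}$, $z \ne \rho_{\mathcal Z}$. Since $|Z| < \tau$
implies

$$|\Phi(Z) - Z\Phi'(Z)| = \Bigl|w_0 - \sum_{k=1}^{\infty} (k-1) w_k Z^k\Bigr| \ge w_0 - \sum_{k=1}^{\infty} (k-1) w_k |Z|^k > w_0 - \sum_{k=1}^{\infty} (k-1) w_k \tau^k = \Phi(\tau) - \tau\Phi'(\tau) \ge 0,$$

it follows that $\Phi(Z) - Z\Phi'(Z) \ne 0$ if $Z = \mathcal Z(z)$ with $|z| = \rho_{\mathcal Z}$, $z \ne \rho_{\mathcal Z}$; hence
the implicit function theorem and (3.13) show that $\mathcal Z(z)$ has an analytic
continuation to some neighbourhood of $z$. Consequently, $\mathcal Z$ then can be
extended across $|z| = \rho_{\mathcal Z}$ everywhere except at $z = \rho_{\mathcal Z}$. (If $\operatorname{span}(\mathbf w) = d$,
the same holds except at the points $z = \rho_{\mathcal Z}\, e^{2\pi \mathrm i j/d}$.)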

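In our case Ia ($\nu > 1$, or equivalently $\tau < \rho$), much more is known: $\mathcal Z$ has
a square root singularity at $\rho_{\mathcal Z}$ with a local expansion of $\mathcal Z(z)$ as an analytic
function of $\sqrt{1 - z/\rho_{\mathcal Z}}$:

$$\mathcal Z(z) = \tau - b\sqrt{1 - z/\rho_{\mathcal Z}} + \dots, \qquad (17.30)$$

where, with $\sigma^2 = \operatorname{Var}\xi$ given by (8.1),

$$b := \sqrt{\frac{2\Phi(\tau)}{\Phi''(\tau)}} = \sqrt2\,\frac{\tau}{\sigma}, \qquad (17.31)$$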



see Meir and Moon 



Flaiolet and Sedgewickl [4d . Theorem VI. 6] and 



Drmota [33, Section 3.1.4 and Theorem 2.19]; in particular, $\mathcal Z$ then extends
analytically to a neighbourhood of $\rho_{\mathcal Z}$ cut at the ray $[\rho_{\mathcal Z},\infty)$. In fact, this
extends (in a weaker form) to the case $\nu \ge 1$ and $\sigma^2 < \infty$ (case Iα): (17.30)



holds in a suitable region, with an error term 0(1^1 — z/pz), see Janson [5 

Remark 17.18. In the case $\nu > 1$, (17.30) and (17.31) yield another proof of
(17.13) by standard singularity analysis, see e.g. Drmota [33, Theorem 3.6]
and Flajolet and Sedgewick [40, Theorem VI.6 and VII.2]; this argument
can be extended to the case $\nu \ge 1$ and $\sigma^2 < \infty$, see Drmota [33, Remark
3.7] and Janson (sol . Appendix]. When f > 1, an expansion with fur ther 
terms can also be obtained, see Minami [89] and Flajolet and Sedgewick [40,
Theorem VI.6].

In the other cases ($\sigma^2 = \infty$ or $\nu < 1$), the asymptotic behaviour of $\mathcal Z$
at the singularity $\rho_{\mathcal Z}$ depends on the behaviour of $\Phi(z)$ at its singularity
$\rho$. It seems difficult to say anything detailed in general, so we study only a
few examples. We assume $\nu \le 1$ and $\omega > 1$; thus Lemma 3.1 implies that
$\rho < \infty$, $\Phi(\rho) < \infty$ and $\Phi'(\rho) < \infty$. We assume also $\rho > 0$ and $\operatorname{span}(\mathbf w) = 1$.

Example 17.19. Suppose that $0 < \rho < \infty$ and that $\Phi(z)$ has an analytic
extension to a sector $D_{\rho,\delta} := \{z : |\arg(\rho - z)| < \pi/2 + \delta \text{ and } |z - \rho| < \delta\}$
for some $\delta > 0$, and that in this sector $D_{\rho,\delta}$, for some $a \ne 0$ and non-integer
$\alpha > 1$, and some $f(z)$ analytic at $\rho$ (which can be taken as a polynomial of
degree $< \alpha$),

$$\Phi(z) = f(z) + a(\rho - z)^{\alpha} + o(|\rho - z|^{\alpha}), \quad\text{as } z \to \rho. \qquad (17.32)$$

(We have to have $\alpha > 1$ since $\Phi'(\rho) < \infty$. For $\alpha \ge 2$ integer, see instead
Example 17.20.) If we assume that $\Phi$ has no further singularities on $|z| = \rho$,
this implies by singularity analysis, see Flajolet and Sedgewick [40, Section
VI.3],

$$w_k \sim \frac{a}{\Gamma(-\alpha)}\,k^{-\alpha-1}\rho^{\alpha-k}, \quad\text{as } k \to \infty. \qquad (17.33)$$

The converse does not hold in general, but can be expected if the weight
sequence is very regular. For example, (17.32) holds (in the plane cut at
$[\rho,\infty)$) if $w_k = (k+1)^{-\beta}$, $k \ge 1$, as in Example 9.7, with $\beta = \alpha + 1 > 2$,
$\rho = 1$ and $a = \Gamma(-\alpha)$, see e.g. [40, Section VI.8].
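Let $F(Z) := Z/\Phi(Z)$, so (3.13) can be written

$$F(\mathcal Z(z)) = z. \qquad (17.34)$$

Since $\nu \le 1$, we have $\tau = \rho$, and thus by Corollary 17.17 and (7.6), $\rho_{\mathcal Z} =
F(\tau) = F(\rho)$ and $\mathcal Z(\rho_{\mathcal Z}) = \rho$. Note that

$$F'(\rho) = \frac{\Phi(\rho) - \rho\Phi'(\rho)}{\Phi(\rho)^2} = \frac{1 - \Psi(\rho)}{\Phi(\rho)} = \frac{1-\nu}{\Phi(\rho)}. \qquad (17.35)$$

If $\nu < 1$, then (17.35) yields $F'(\rho) > 0$ and (17.34) shows that $\rho - \mathcal Z(z) \sim
F'(\rho)^{-1}(\rho_{\mathcal Z} - z)$ as $z \to \rho_{\mathcal Z}$. Moreover, $F$ is defined in a sector $D_{\rho,\delta}$, and its
image contains some similar sector $D_{\rho_{\mathcal Z},\delta'}$ (with $0 < \delta' < \delta$) such that $\mathcal Z(z)$
extends analytically to $D_{\rho_{\mathcal Z},\delta'}$ by (17.34), and it follows easily by (17.34)
and (17.32) that in $D_{\rho_{\mathcal Z},\delta'}$, with some $f_1(z)$ analytic at $\rho_{\mathcal Z}$,

$$\mathcal Z(z) = f_1(z) + a_1(\rho_{\mathcal Z} - z)^{\alpha} + o(|\rho_{\mathcal Z} - z|^{\alpha}), \quad\text{as } z \to \rho_{\mathcal Z}, \qquad (17.36)$$

where

$$a_1 = \frac{a\,\rho\,\Phi(\rho)^{\alpha-1}}{(1-\nu)^{\alpha+1}}. \qquad (17.37)$$

As noted above, $\mathcal Z(z)$ has no other singularities on $|z| = \rho_{\mathcal Z}$, and singularity
analysis [13] applies and shows, using (17.33),

$$Z_n \sim \frac{a_1}{\Gamma(-\alpha)}\,\rho_{\mathcal Z}^{\,\alpha-n}\, n^{-\alpha-1} \sim \frac{\rho}{(1-\nu)^{\alpha+1}}\,\Phi(\rho)^{n-1} w_n. \qquad (17.38)$$

However, we will show in greater generality in Theorem 18.33 and Remark 18.34
(by a straightforward reduction to the case $\rho = 1$ using (4.3))
that (17.33) always implies (17.38) when $\nu < 1$, without any assumption
like (17.32) on $\Phi(z)$.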

If $\nu = 1$, we assume $1 < \alpha < 2$, since (17.32) with $\alpha > 2$ implies $\Phi''(\rho) <
\infty$ and thus $\sigma^2 < \infty$, so (17.30) and Theorem 17.11 would apply. We now
have $F'(\rho) = 0$, and (17.32)-(17.34) yield, in some domain $D_{\rho_{\mathcal Z},\delta'}$,

Singularity analysis yields 

However, we have already proved in Theorem 17.14(i) (assuming $\rho = 1$,
without loss of generality) that (17.33) implies (17.40) in this case, without
any assumption like (17.32) on $\Phi(z)$.

Example 17.20. If $\alpha \ge 2$ is an integer, (17.32) does not exhibit a singularity.
Instead we consider $\mathbf w$ with, for some $f$ analytic at $\rho$,

$$\Phi(z) = f(z) + a(\rho - z)^{\alpha}\log(\rho - z) + O(|\rho - z|^{\alpha}), \qquad (17.41)$$

as $z \to \rho$ in some sector $D_{\rho,\delta}$. This includes the case $w_k = (k+1)^{-\alpha-1}$, see
Flajolet and Sedgewick [40, Section VI.8].

In the case $\nu < 1$, we obtain as above

$$\mathcal Z(z) = f_1(z) + a_1(\rho_{\mathcal Z} - z)^{\alpha}\log(\rho_{\mathcal Z} - z) + O(|\rho_{\mathcal Z} - z|^{\alpha}), \qquad (17.42)$$

as $z \to \rho_{\mathcal Z}$ in some sector, with $f_1(z)$ analytic at $\rho_{\mathcal Z}$ and $a_1$ given by (17.37).
We again obtain by singularity analysis

$$Z_n \sim \frac{\rho}{(1-\nu)^{\alpha+1}}\,\Phi(\rho)^{n-1} w_n, \qquad (17.43)$$

which is another instance of (jlS.llSp . 

In the case $\nu = 1$, we consider only $\alpha = 2$, since $\sigma^2 < \infty$ if $\alpha > 2$. Then
(17.41) yields (we have $a < 0$ in this case)

Z{z) = p - (1 - zipz)"' (- log (1 - zlpz))-'l' + .... (17.44) 

Singularity analysis [40, Theorems VI.2-3] gives another proof of (17.17) in
the special case (17.41) (again assuming $\rho = 1$, as we may).

Example 17.21. Define $\mathbf w$ by $\Phi(z) = w_0 + \sum_{j=0}^{\infty} 2^{-2j} z^{2^j}$, for some $w_0 > 0$;
thus $\operatorname{supp}(\mathbf w)$ is the lacunary sequence $\{0\} \cup \{2^j\}$. Then $\rho = 1$, $\Phi(\rho) =
w_0 + 4/3$ and $\Phi'(\rho) = 2$; hence $\nu = \Psi(\rho) = 2/(w_0 + 4/3)$. The function $\Phi(z)$
is analytic in the unit disc and has the unit circle as a natural boundary; it
cannot be extended analytically at any point. (See e.g. Rudin [101, Remark
16.4 and Theorem 16.6].)

Taking $w_0 > 2/3$, we have $\nu < 1$; hence, $F'(\rho) > 0$ by (17.35). Thus
$F$ maps the unit circle onto a closed curve $\Gamma$ that goes vertically through
$F(1) = \rho_{\mathcal Z}$, and since $F$ cannot be continued analytically across the unit circle,
$\mathcal Z(z)$ cannot be continued analytically across the curve $\Gamma$. In particular,
$\mathcal Z(z)$ is not analytic in any sector $D_{\rho_{\mathcal Z},\delta'}$.


18. Largest degrees and boxes 

Consider a random allocation $B_{m,n} = (Y_1,\dots,Y_n)$ and arrange $Y_1,\dots,Y_n$
in decreasing order as $Y_{(1)} \ge Y_{(2)} \ge \dots$. Thus, $Y_{(1)}$ is the largest number of
balls in any box, $Y_{(2)}$ is the second largest, and so on.

By Lemma 16.1, we may also consider the random tree $\mathcal T_n$ by taking
$m = n-1$; then $Y_{(1)}$ is the largest outdegree in $\mathcal T_n$, $Y_{(2)}$ is the second largest
outdegree, and so on.

As usual, we consider asymptotics as $n \to \infty$ and $m/n \to \lambda$. (Thus $\lambda = 1$
in the tree case.) We usually ignore the cases $m/n \to 0$ and $m/n \to \infty$; these
are left to the reader as open problems. (See e.g. Kolchin, Sevast'yanov and 
Chistyakov [ttI ]. Kolchin [76], Pavlov ['Od] and Kazimirov [70] for examples 
of such results.) 

The results in Sections 7 and 10 suggest that $Y_{(1)}$ is small when $\lambda < \nu$,
but large (perhaps of order $n$) when $\lambda > \nu$, which is one aspect of the phase
transition at $\lambda = \nu$. We will see that this roughly is correct, but that the
full story is somewhat more complicated.

We study the cases $\lambda \le \nu$ and $\lambda > \nu$ separately; we also consider separately
several subcases of the first case where we can give more precise
results.

We first note that the case $\omega < \infty$, when the box capacities (node degrees
in the tree case) are bounded, is trivial: w.h.p. the maximum is attained in
many boxes.

Theorem 18.1. Let $\mathbf w = (w_k)_{k\ge0}$ be a weight sequence with $w_0 > 0$ and
$\omega < \infty$. Suppose that $n \to \infty$ and $m = m(n)$ with $m/n \to \lambda > 0$. Then
$Y_{(j)} = \omega$ w.h.p. for every fixed $j$.

Proof. Clearly, each $Y_i \le \omega$, so $Y_{(j)} \le Y_{(1)} \le \omega$.

We assume tacitly, as always, that $B_{m,n}$ exists, i.e. $Z(m,n) > 0$, and
thus $m \le \omega n$, so $\lambda \le \omega$. By Theorem 10.4 if $\lambda < \omega$, and Remark 10.10 if
$\lambda = \omega$, $N_\omega(B_{m,n})/n \overset{p}{\longrightarrow} \pi_\omega > 0$. In particular, $N_\omega(B_{m,n}) \overset{p}{\longrightarrow} \infty$, and thus
$P(Y_{(j)} = \omega) \to 1$. □

18.1. The case $\lambda \le \nu$. In the case $\lambda \le \nu$, we show that, indeed, all $Y_i$ are
small. Theorems 18.2-18.3 yield (w.h.p.) a bound $o(n)$ when $\lambda = \nu$, and a
much stronger logarithmic bound $O(\log n)$ when $\lambda < \nu$. (In the tree case,
we have $\lambda = 1$, so these are the cases $\nu = 1$ and $\nu > 1$.)

Example 18.27 shows that in general, the bound $o(n)$ when $\lambda = \nu$ is
essentially best possible; at least, we can have Y(i) > w.h.p. for any 
given e > 0. 

Theorem 18.2. Let $\mathbf w = (w_k)_{k\ge0}$ be a weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 1$. Suppose that $n \to \infty$ and $m = m(n)$ with $m/n \to \lambda$
where $0 \le \lambda < \infty$. If $\lambda \le \nu$, then $Y_{(1)} = o_p(n)$.

Equivalently, $Y_{(1)}/n \overset{p}{\longrightarrow} 0$.

Proof. The case $\lambda = 0$ is trivial, since $Y_{(1)}/n \le m/n \to \lambda$. The case $\lambda = \omega$ is
also trivial, since then $\omega < \infty$ and $Y_{(1)} \le \omega$. As above, $\lambda > \omega$ is impossible.
Hence we may assume $0 < \lambda < \omega$ and $\nu \ge \lambda > 0$, which implies $\tau > 0$, where
$\Psi(\tau) = \lambda$, cf. Theorem 10.4. We may then for convenience replace $(w_k)$ by
the equivalent weight sequence $(\pi_k)$ in (10.13): we may thus assume that $\mathbf w$
is a probability weight sequence with $\tau = 1$, and thus $\rho \ge \tau = 1$, and then
the corresponding random variable $\xi$ has $E\xi = \lambda$.
By (17.11) and symmetry, for any $k \ge 0$,

$$P(Y_{(1)} = k) \le n P(Y_1 = k) = n\,\frac{w_k Z(m-k,n-1)}{Z(m,n)}. \qquad (18.1)$$

Furthermore, Wk = T^k = = ^) ^ 1 and, using Example 110.21 Z{m,n) = 
P(5„ = m) = e"^""^ by Lemma Eil (p > 1) or [133] (p = 1)- We turn to 
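estimating $Z(m-k,n-1)$.

Let $0 < \varepsilon < \lambda$, and define $\tau_\varepsilon$ by $\Psi(\tau_\varepsilon) = \lambda - \varepsilon$. Since $\Psi(\tau) = \lambda$, we have
$0 < \tau_\varepsilon < \tau = 1$.

For each $n$, choose $k = k(n) \in [\varepsilon n, m]$ such that $Z(m-k,n-1)$ is
maximal. We have $\varepsilon \le k/n \le m/n \to \lambda$; choose a subsequence such that
$k/n$ converges, say $k/n \to \gamma$ with $\varepsilon \le \gamma \le \lambda$. Then, along the subsequence,
$(m-k)/(n-1) \to \lambda - \gamma$.

By Theorem 17.1 (and Remark 17.2, ignoring the trivial case $Z(m-k,n-1) = 0$),
using $\tau_\varepsilon < 1$, $\gamma \ge \varepsilon$ and (10.16),

$$\frac1n \log Z(m-k,n-1) \to \log \inf_{t \ge 0} \frac{\Phi(t)}{t^{\lambda-\gamma}} \le \log \inf_{0 \le t \le 1} \frac{\Phi(t)}{t^{\lambda-\gamma}} \le \log \inf_{0 \le t \le 1} \frac{\Phi(t)}{t^{\lambda-\varepsilon}} = \log \frac{\Phi(\tau_\varepsilon)}{\tau_\varepsilon^{\lambda-\varepsilon}} =: c_\varepsilon,$$

say, where Remark 10.5 shows that, since $\tau_\varepsilon \ne 1$,

$$c_\varepsilon < \log\bigl(\Phi(1)/1^{\lambda-\varepsilon}\bigr) = 0. \qquad (18.2)$$

We have shown that

$$\limsup_{n\to\infty} \frac1n \log Z(m-k,n-1) \le c_\varepsilon \qquad (18.3)$$

for $k = k(n)$ and any subsequence such that $k/n$ converges; it follows that
(18.3) holds for the full sequence. In other words,

$$\log Z(m-k,n-1) \le c_\varepsilon n + o(n) \qquad (18.4)$$

for our choice $k = k(n)$ that maximises the left-hand side, and thus uniformly
for all $k \in [\varepsilon n, m]$. Using (18.4) and, as said above, Lemma 13.4 in (18.1)
we obtain, recalling (18.2),

$$P(Y_{(1)} \ge \varepsilon n) = \sum_{k \ge \varepsilon n}^{m} P(Y_{(1)} = k) \le mn\,e^{c_\varepsilon n + o(n)} e^{o(n)} = e^{c_\varepsilon n + o(n)} \to 0.$$

In other words, for any $\varepsilon > 0$, $Y_{(1)} < \varepsilon n$ w.h.p., which is equivalent to
$Y_{(1)} = o_p(n)$. □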

The following logarithmic bound when A < is essentially due to Meir 
and Moon [sg'] (who studied the tree case). 

Theorem 18.3. Let $\mathbf w = (w_k)_{k\ge0}$ be a weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 1$. Suppose that $n \to \infty$ and $m = m(n)$ with $m/n \to \lambda$.
Assume $0 < \lambda < \nu$, and define $\tau \in (0,\rho)$ by $\Psi(\tau) = \lambda$.

(i) Then $\tau < \rho$ and

$$Y_{(1)} \le \frac{1}{\log(\rho/\tau)}\log n + o_p(\log n). \qquad (18.5)$$

(ii) In particular, if $\rho = \infty$, then $Y_{(1)} = o_p(\log n)$.
(iii) If further $w_k^{1/k} \to 1/\rho$ as $k \to \infty$, then, for every fixed $j \ge 1$,

$$\frac{Y_{(j)}}{\log n} \overset{p}{\longrightarrow} \frac{1}{\log(\rho/\tau)}. \qquad (18.6)$$



Recall that l/p = limsup^_^oQ , see (|3.5p . so the extra assumption 



irregular. (The proof shows that the assumption can be weakened to P(^ ^ 



It is not difficult to show Theorem 18.3 directly, but we prefer to postpone
the proof and use parts of the more refined Theorem 18.7 below, in order to
avoid some repetitions of arguments.
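A small simulation can make the logarithmic scale in Theorem 18.3 concrete. The sketch assumes the weights $w_k = 1/k!$ (an illustrative choice): then $B_{m,n}$ is obtained by throwing $m$ balls independently and uniformly into $n$ boxes, since the multinomial probability is proportional to $\prod_i 1/y_i!$, and $\rho = \infty$, so part (ii) applies. The function name is hypothetical, and the seed is fixed only for reproducibility.

```python
import random
from collections import Counter
from math import log


def sample_max_load(m, n, rng):
    """Throw m balls uniformly into n boxes; return the box counts and Y_(1)."""
    counts = Counter(rng.randrange(n) for _ in range(m))
    return counts, max(counts.values())


rng = random.Random(0)
results = []
for n in (10**3, 10**4, 10**5):
    counts, y1 = sample_max_load(n - 1, n, rng)   # tree case m = n - 1, lambda = 1
    results.append((n, y1, y1 / log(n)))          # Y_(1) / log n should be small
```

A single run per value of $n$ only gives a rough impression; it is not a substitute for the statement in probability.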

We conjecture that Theorem 18.3 holds also for $\lambda = 0$. Since then $\tau = 0$,
this means the following. (This seems almost obvious given the result for
positive $\lambda$ in Theorem 18.3, where the constant $1/\log(\rho/\tau) \to 0$ as $\lambda \to 0$
and thus $\tau \to 0$, but there is no general monotonicity and we leave this as
an open problem.)

Conjecture 18.4. If $\rho > 0$ and $m/n \to 0$, then $Y_{(1)} = o_p(\log n)$.

18.2. The subcase $\sigma^2 < \infty$. In the case $\sigma^2 := \operatorname{Var}\xi < \infty$ (which includes
the case $\lambda < \nu$), there is a much more precise result, which says that, simply,
the largest numbers $Y_{(1)}, Y_{(2)}, \dots$ asymptotically have the same distribution
as the largest elements in the i.i.d. sequence $\xi_1, \dots, \xi_n$. (Provided we choose
the distribution of $\xi$ correctly, and possibly depending on $n$, see below for
details.) In other words, the conditioning in Example 10.2 then has asymptotically
no effect on the largest elements of the sequence. (When $\sigma^2 = \infty$
this is no longer necessarily true, however, as we shall see in Example 18.27.)

In order to state this precisely, we now assume that $\omega = \infty$ (see Theorem 18.1
otherwise) and $0 < \lambda \le \nu$, and define as usual $\tau$ by $\Psi(\tau) = \lambda$, and
let $\xi$ be a random variable with the distribution in (10.13).

If $m/n < \nu$, we further define $\tau_n$ by $\Psi(\tau_n) = m/n$, and let $\xi^{(n)}$ be the
random variable with the distribution in (13.10). We will only use $\tau_n$ and
$\xi^{(n)}$ in the case $\lambda < \nu$, so $m/n \to \lambda < \nu$ and $\tau_n$ really is defined (at least for
large $n$); furthermore $\tau_n \to \tau < \rho$ and $\xi^{(n)} \overset{d}{\longrightarrow} \xi$.

We further let $\xi_1, \dots, \xi_n$ and (when $\lambda < \nu$) $\xi_1^{(n)}, \dots, \xi_n^{(n)}$ be i.i.d. sequences
of copies of $\xi$ and $\xi^{(n)}$, respectively, and we arrange them in decreasing order
as $\xi_{(1)} \ge \dots \ge \xi_{(n)}$ and $\xi^{(n)}_{(1)} \ge \dots \ge \xi^{(n)}_{(n)}$. Finally, we introduce the counting
variables, for any subset $A \subseteq \mathbb N_0$,

$$N_A := |\{i \le n : Y_i \in A\}|, \tag{18.7}$$
$$\overline N_A := |\{i \le n : \xi_i \in A\}|, \tag{18.8}$$
$$\overline N^{(n)}_A := |\{i \le n : \xi^{(n)}_i \in A\}|. \tag{18.9}$$

($N_A$ and $\overline N_A$ also depend on $n$, but as usual, we for simplicity do not show
this in the notation.) Note that $\overline N_A$ and $\overline N^{(n)}_A$ simply have binomial
distributions $\overline N_A \sim \operatorname{Bi}(n, \mathbb P(\xi \in A))$ and $\overline N^{(n)}_A \sim \operatorname{Bi}(n, \mathbb P(\xi^{(n)} \in A))$.
We have

$$Y_{(j)} \le k \iff N_{[k+1,\infty)} < j, \tag{18.10}$$

and similarly for $\xi_{(j)}$ and $\xi^{(n)}_{(j)}$. Thus it is elementary to obtain asymptotic
results for the maximum $\xi_{(1)}$ of i.i.d. variables, and more generally for $\xi_{(j)}$
and $\xi^{(n)}_{(j)}$, see e.g. Leadbetter, Lindgren and Rootzén [82].

We introduce three different probability metrics to state the results. For
discrete random variables $X$ and $Y$ with values in $\mathbb N_0$ (the case we are
interested in here), we define the Kolmogorov distance

$$d_{\mathrm K}(X,Y) := \sup_{x\in\mathbb N_0} \bigl|\mathbb P(X \le x) - \mathbb P(Y \le x)\bigr| \tag{18.11}$$

and the total variation distance

$$d_{\mathrm{TV}}(X,Y) := \sup_{A\subseteq\mathbb N_0} \bigl|\mathbb P(X \in A) - \mathbb P(Y \in A)\bigr|. \tag{18.12}$$

In order to treat also the case with variables tending to $\infty$, we further define
the modified Kolmogorov distance

$$\tilde d_{\mathrm K}(X,Y) := \sup_{x\in\mathbb N_0} \frac{\bigl|\mathbb P(X \le x) - \mathbb P(Y \le x)\bigr|}{1+x}. \tag{18.13}$$

For $\tilde d_{\mathrm K}$, we also allow random variables in $\overline{\mathbb N}_0$, i.e., we allow the value $\infty$.
(Furthermore, the definitions of $d_{\mathrm K}$ and $d_{\mathrm{TV}}$ and the results for them in the
lemma below extend to random variables with values in $\mathbb Z$. The definitions
extend further to random variables with values in $\mathbb R$ for $d_{\mathrm K}$, and in any space
for $d_{\mathrm{TV}}$; but not all properties below hold in this generality.)

Note that these distances depend only on the distributions $\mathcal L(X)$ and
$\mathcal L(Y)$, so $d(\mathcal L(X), \mathcal L(Y))$ might be a better notation, but we find it convenient
to allow both notations, as well as the mixed $d(X, \mathcal L(Y))$.

It is obvious that the three distances above are metrics on the space of
probability measures on $\mathbb N_0$ (or on $\overline{\mathbb N}_0$).

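We collect a few simple, and mostly well-known, facts for these three
metrics in a lemma; the proofs are left to the reader.

Lemma 18.5. (i) For any random variables $X$ and $Y$ with values in $\mathbb N_0$,
$\tilde d_{\mathrm K}(X,Y) \le d_{\mathrm K}(X,Y) \le d_{\mathrm{TV}}(X,Y)$.

(ii) For any $X$ and $X_1, X_2, \dots$ with values in $\mathbb N_0$,

$$X_n \overset{d}{\longrightarrow} X \iff d_{\mathrm{TV}}(X_n,X) \to 0 \iff d_{\mathrm K}(X_n,X) \to 0 \iff \tilde d_{\mathrm K}(X_n,X) \to 0.$$

(iii) For any $X$ and $X_1, X_2, \dots$ with values in $\overline{\mathbb N}_0$,

$$X_n \overset{d}{\longrightarrow} X \iff \tilde d_{\mathrm K}(X_n,X) \to 0.$$

In particular,

$$X_n \overset{p}{\longrightarrow} \infty \iff \tilde d_{\mathrm K}(X_n,\infty) \to 0.$$

(iv) For any $X_n$ and $X'_n$ with values in $\overline{\mathbb N}_0$, $\tilde d_{\mathrm K}(X_n, X'_n) \to 0 \iff
|\mathbb P(X_n \le x) - \mathbb P(X'_n \le x)| \to 0$ for every fixed $x \ge 0$.

(v) For any $X_n$ and $X'_n$, $d_{\mathrm{TV}}(X_n, X'_n) \to 0 \iff$ there exists a coupling
$(X_n, X'_n)$ with $X_n = X'_n$ w.h.p. (We denote this also by $X_n \overset{d}{\approx} X'_n$.)

(vi) The supremum in (18.12) is attained, and the absolute value sign is
redundant. In fact, if $A := \{i : \mathbb P(X = i) > \mathbb P(Y = i)\}$, then $d_{\mathrm{TV}}(X,Y) =
\mathbb P(X \in A) - \mathbb P(Y \in A)$.

(vii) For any $X$ and $Y$ with values in $\mathbb N_0$,

$$d_{\mathrm{TV}}(X,Y) = \sum_{x} \bigl(\mathbb P(X = x) - \mathbb P(Y = x)\bigr)_+ = \frac12 \sum_{x} \bigl|\mathbb P(X = x) - \mathbb P(Y = x)\bigr|.$$

□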
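As a quick numerical illustration of the definitions (18.11)-(18.12) and of the two formulas for the total variation distance in the lemma, here is a minimal sketch for probability mass functions given as finite lists indexed by $0, 1, 2, \dots$. The function names and the two example distributions are hypothetical choices made only for this illustration.

```python
def kolmogorov_distance(p, q):
    """d_K from (18.11): sup over x of |P(X <= x) - P(Y <= x)|."""
    size = max(len(p), len(q))
    cdf_p = cdf_q = best = 0.0
    for x in range(size):
        cdf_p += p[x] if x < len(p) else 0.0
        cdf_q += q[x] if x < len(q) else 0.0
        best = max(best, abs(cdf_p - cdf_q))
    return best

def _differences(p, q):
    size = max(len(p), len(q))
    return [(p[x] if x < len(p) else 0.0) - (q[x] if x < len(q) else 0.0)
            for x in range(size)]

def tv_positive_part(p, q):
    """d_TV as the sum of the positive parts of P(X = x) - P(Y = x)."""
    return sum(d for d in _differences(p, q) if d > 0)

def tv_half_l1(p, q):
    """d_TV as half the l1 distance between the two mass functions."""
    return 0.5 * sum(abs(d) for d in _differences(p, q))

# Example: X uniform on {0, 2}, Y identically 1 (disjoint supports).
p = [0.5, 0.0, 0.5]
q = [0.0, 1.0, 0.0]
d_k = kolmogorov_distance(p, q)
d_tv = tv_positive_part(p, q)
```

In this example the two total variation formulas agree and the inequality $d_{\mathrm K} \le d_{\mathrm{TV}}$ is strict.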



Remark 18.6. The three metrics are, by Lemma 18.5, equivalent in the
usual sense that they define the same topology, but they are not uniformly
equivalent. For example, if $X_n \sim \operatorname{Po}(n)$, $X'_n := 2\lfloor X_n/2\rfloor$ (i.e., $X_n$ rounded
down to an even integer) and $X''_n := X'_n + 1$, then $d_{\mathrm K}(X'_n, X''_n) \to 0$ as
$n \to \infty$, but $d_{\mathrm{TV}}(X'_n, X''_n) = 1$.
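The example in the remark can be checked numerically. The sketch below builds the mass functions of a Poisson variable rounded down to an even integer and of the same variable plus one, and evaluates both distances; the truncation point of the Poisson distribution and all function names are assumptions made for the illustration (the neglected tail mass is negligible for the values of $n$ used).

```python
import math

def poisson_pmf(mean, size):
    """Po(mean) probabilities for 0..size-1, computed in log space."""
    return [math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))
            for k in range(size)]

def rounded_pair(n):
    """Mass functions of 2*floor(X/2) and 2*floor(X/2) + 1 for X ~ Po(n)."""
    size = 2 * (2 * n + 20)            # even truncation point, far in the tail
    po = poisson_pmf(n, size)
    even = [0.0] * size
    odd = [0.0] * size
    for j in range(0, size, 2):
        mass = po[j] + po[j + 1]       # both 2j' and 2j'+1 round down to 2j'
        even[j] = mass
        odd[j + 1] = mass
    return even, odd

def kolmogorov_distance(p, q):
    cdf_p = cdf_q = best = 0.0
    for a, b in zip(p, q):
        cdf_p += a
        cdf_q += b
        best = max(best, abs(cdf_p - cdf_q))
    return best

def total_variation(p, q):
    return sum(max(a - b, 0.0) for a, b in zip(p, q))

dk_small = kolmogorov_distance(*rounded_pair(10))
dk_large = kolmogorov_distance(*rounded_pair(1000))
dtv_large = total_variation(*rounded_pair(1000))
```

The Kolmogorov distance equals the largest mass of a pair $\{2j, 2j+1\}$, which is of order $n^{-1/2}$, while the supports are disjoint and the total variation distance stays at 1 (up to the truncated tail).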

We define $\operatorname{Po}(\infty)$ as the distribution of a random variable that equals $\infty$
identically.

After all these preliminaries, we state the result (together with some
supplementary results). There are really two versions; it turns out that for
general sequences $m(n)$, we have to use the random variables $\xi^{(n)}$ with
$\mathbb E\,\xi^{(n)} = m/n$ exactly tuned to $m(n)$, but under a weak assumption we
can replace $\xi^{(n)}$ by $\xi$ and obtain a somewhat simpler statement, which we
choose as our main formulation. (This goes back to Meir and Moon [87], who
proved (i) in the tree case, assuming $\lambda < \nu$; see also Kolchin, Sevast'yanov
and Chistyakov [77, Theorem 1.6.1] and Kolchin [76, Theorem 1.5.2] for $Y_{(1)}$
in the special case in Example 11.1.)

Theorem 18.7. Let $\mathbf w = (w_k)_{k\ge0}$ be a weight sequence with $w_0 > 0$ and
$\omega = \infty$. Suppose that $n \to \infty$ and $m = m(n)$ with $m = \lambda n + o(\sqrt n)$ where
$0 < \lambda \le \nu$, and use the notation above. Suppose further that $\sigma^2 := \operatorname{Var}\xi <
\infty$. (This is redundant when $\lambda < \nu$.)

(i) If (possibly for $n$ in a subsequence) $h(n)$ are integers such that
$n\mathbb P(\xi \ge h(n)) \to \alpha$, for some $\alpha \in [0,\infty]$, then

$$N_{[h(n),\infty)} := |\{i : Y_i \ge h(n)\}| \overset{d}{\longrightarrow} \operatorname{Po}(\alpha).$$

(ii) If $h(n)$ are integers such that $n\mathbb P(\xi \ge h(n)) \to 0$, then w.h.p. $Y_{(1)} <
h(n)$.

(iii) If $h(n)$ are integers such that $n\mathbb P(\xi \ge h(n)) \to \infty$, then, for every
fixed $j$, w.h.p. $Y_{(j)} \ge h(n)$.

(iv) For any sequence $h(n)$, $\tilde d_{\mathrm K}\bigl(N_{[h(n),\infty)}, \overline N_{[h(n),\infty)}\bigr) \to 0$.

(v) For every fixed $j$, $d_{\mathrm K}(Y_{(j)}, \xi_{(j)}) \to 0$.

(vi) $d_{\mathrm{TV}}(Y_{(1)}, \xi_{(1)}) \to 0$.

If $\lambda < \nu$, the condition $m = \lambda n + o(\sqrt n)$ can be weakened to $m/n =
\lambda + o(1/\log n)$.

Moreover, if $\lambda < \nu$, then the results hold for any $m = m(n)$ with $m/n \to
\lambda$, provided $\xi$ is replaced by $\xi^{(n)}$, $\overline N$ by $\overline N^{(n)}$ and $\xi_{(j)}$ by $\xi^{(n)}_{(j)}$.

Remark 18.8. In the version with $\xi^{(n)}$, we do not need $\lambda$ at all. By
considering subsequences, it follows that it suffices that $0 < c \le m/n \le C < \nu$.
(Cf. Theorem 10.6.) Furthermore, this version extends to the case $\lambda = \nu$
and $m/n < \nu$, but we have ignored this case for simplicity.

Problem 18.9. Is Theorem 18.7 (in the $\xi^{(n)}$ version) true also for $\lambda = 0$?

The total variation approximation in (vi) is stronger than the Kolmogorov
distance approximation in (v), and our proof is considerably longer, but for
many purposes (v) is enough. We conjecture that total variation approximation
holds for every $Y_{(j)}$, and not just for $Y_{(1)}$; presumably this can be
shown by a modification of the proof for $Y_{(1)}$ below, but we have not checked
the details and leave this as an open problem. Furthermore, we believe that
the result extends to the joint distribution of finitely many $Y_{(j)}$. (The
corresponding result in (v), using a multivariate version of the Kolmogorov
distance, is easily verified by the methods below.)

Problem 18.10. Does $d_{\mathrm{TV}}(Y_{(j)}, \xi_{(j)}) \to 0$ hold for every fixed $j$, under the
assumptions of Theorem 18.7?

Proof of Theorem 18.7. As in the proof of Theorem 18.2 we may replace
$(w_k)$ by the equivalent weight sequence $(\pi_k)$ in (10.13). We may thus assume
that $\mathbf w$ is a probability weight sequence with $\tau = 1$, and thus $\rho \ge \tau = 1$,
and the corresponding random variable $\xi$ has $\mathbb E\,\xi = \lambda$. We consider first
the version with $\xi$, assuming $m = \lambda n + o(\sqrt n)$, and discuss afterwards the
modifications for $\xi^{(n)}$.

We begin by looking again at (11.11):

$$\mathbb P(Y_1 = k) = \frac{w_k Z(m-k, n-1)}{Z(m,n)}. \tag{18.14}$$

When $m = \lambda n + o(\sqrt n)$, we may apply Lemma 13.1 and Remark 13.2 and
thus, with $d := \operatorname{span}(\mathbf w)$,

$$Z(m,n) = \mathbb P(S_n = m) = \frac{d + o(1)}{\sqrt{2\pi\sigma^2 n}}. \tag{18.15}$$

Moreover, by (13.8), for any $k$,

$$Z(m-k, n-1) = \mathbb P(S_{n-1} = m-k) \le \frac{d + o(1)}{\sqrt{2\pi\sigma^2 n}}. \tag{18.16}$$

Consequently, (18.14) yields, uniformly for all $k$,

$$\mathbb P(Y_1 = k) \le (1 + o(1)) w_k = (1 + o(1))\,\mathbb P(\xi = k). \tag{18.17}$$

In particular, we may sum over $k \ge K$ and obtain, for any $K = K(n)$,

$$\mathbb P(Y_1 \ge K) \le (1 + o(1))\,\mathbb P(\xi \ge K). \tag{18.18}$$

Since, by assumption, $\mathbb E\,\xi^2 < \infty$, we have $\mathbb P(\xi \ge K) = o(K^{-2})$ as $K \to \infty$.
Hence, for every fixed $\delta > 0$, $\mathbb P(\xi \ge \delta\sqrt n) = o(n^{-1})$. It follows that there
exists a sequence $\delta_n \to 0$ such that $\mathbb P(\xi \ge \delta_n\sqrt n) = o(n^{-1})$. Consequently,
defining $B(n) := \delta_n\sqrt n$, we have $B(n) = o(\sqrt n)$ and

$$\mathbb P(\xi \ge B(n)) = o(n^{-1}), \tag{18.19}$$

and thus, by (18.18) and symmetry,

$$\mathbb P(Y_{(1)} \ge B(n)) \le n\,\mathbb P(Y_1 \ge B(n)) \le n(1 + o(1))\,\mathbb P(\xi \ge B(n)) = o(1). \tag{18.20}$$

Hence, $Y_{(1)} < B(n)$ w.h.p.

Similarly, $\mathbb P(\xi_{(1)} \ge B(n)) \le n\,\mathbb P(\xi_1 \ge B(n)) = o(1)$, so $\xi_{(1)} < B(n)$ w.h.p.
Write, for convenience, $N := N_{[h(n),B(n)]}$, and note that w.h.p. $Y_{(1)} \le
B(n)$ and then $N = N_{[h(n),\infty)}$. (We assume for simplicity $h(n) \le B(n)$;
otherwise we let $N := 0$, leaving the trivial modifications in this case to the
reader.)

Moreover, for $k \le B(n) = o(\sqrt n)$, we have $(m - k) - (n - 1)\lambda = o(\sqrt n)$,
and thus Remark 13.2 shows that, for any $k = k(n) \le B(n)$,

$$Z(m-k, n-1) = \mathbb P(S_{n-1} = m-k) = \frac{d + o(1)}{\sqrt{2\pi\sigma^2 n}}. \tag{18.21}$$

Since we here may take $k = k(n)$ that maximises or minimises this for $k \le
B(n)$, it follows that (18.21) holds uniformly for all $k \le B(n)$. Consequently,
by (18.14), (18.15) and (18.21),

$$\mathbb P(Y_1 = k) = (1 + o(1)) w_k = (1 + o(1))\,\mathbb P(\xi = k), \tag{18.22}$$

uniformly for all $k \le B(n)$. By the assumption and (18.19), this yields

$$\begin{aligned}
\mathbb E N &= n \sum_{k=h(n)}^{B(n)} \mathbb P(Y_1 = k) = n \sum_{k=h(n)}^{B(n)} (1 + o(1))\,\mathbb P(\xi = k) \\
&= (1 + o(1))\, n\,\mathbb P(h(n) \le \xi \le B(n)) \\
&= (1 + o(1))\, n \bigl(\mathbb P(\xi \ge h(n)) - \mathbb P(\xi > B(n))\bigr) \to \alpha.
\end{aligned}$$
Similarly, again using the symmetry as well as Lemma 13.1 and Remark 13.2,

$$\begin{aligned}
\mathbb E N(N-1) &= n(n-1)\,\mathbb P\bigl(Y_1, Y_2 \in [h(n), B(n)]\bigr) \\
&= n(n-1) \sum_{k_1,k_2=h(n)}^{B(n)} \mathbb P(Y_1 = k_1 \text{ and } Y_2 = k_2) \\
&= n(n-1) \sum_{k_1,k_2=h(n)}^{B(n)} \frac{w_{k_1} w_{k_2} Z(m - k_1 - k_2, n-2)}{Z(m,n)} \\
&= n(n-1) \sum_{k_1,k_2=h(n)}^{B(n)} \mathbb P(\xi = k_1)\,\mathbb P(\xi = k_2)(1 + o(1)) \\
&= (1 + o(1))\, n^2 \bigl(\mathbb P(\xi \ge h(n)) - \mathbb P(\xi > B(n))\bigr)^2 \to \alpha^2.
\end{aligned}$$

Moreover, the same argument works for any factorial moment $\mathbb E(N)_\ell$ and
yields $\mathbb E(N)_\ell \to \alpha^\ell$ for every $\ell \ge 1$. If $\alpha < \infty$, we thus obtain $N \overset{d}{\longrightarrow} \operatorname{Po}(\alpha)$
by the method of moments, and the result follows, since $N = N_{[h(n),\infty)}$
w.h.p.

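If $\alpha = \infty$, this argument yields

$$\mathbb E(N)_\ell \sim \bigl(n\,\mathbb P(\xi \ge h(n))\bigr)^\ell \to \infty \tag{18.23}$$

for every $\ell \ge 1$, and we make a thinning: Let $A$ be a constant and let
$q := A/(n\,\mathbb P(\xi \ge h(n)))$; then $q \to A/\alpha = 0$. We consider only $n$ that are
so large that $q < 1$. We then randomly, and independently, mark each box
with probability $q$. Let $N'$ be the random number of marked boxes $i$ such
that $Y_i \in [h(n), B(n)]$. Then, for every $\ell \ge 1$, using (18.23),

$$\mathbb E(N')_\ell = (n)_\ell\, q^\ell\, \mathbb P\bigl(Y_1, \dots, Y_\ell \in [h(n), B(n)]\bigr) = q^\ell\, \mathbb E(N)_\ell \to A^\ell. \tag{18.24}$$

Consequently, by the method of moments, $N' \overset{d}{\longrightarrow} \operatorname{Po}(A)$. In particular, this
shows, for every fixed $x$,

$$\mathbb P(N < x) \le \mathbb P(N' < x) \to \mathbb P(\operatorname{Po}(A) < x),$$

which can be made arbitrarily small by taking $A$ large. Hence, $\mathbb P(N < x) \to 0$
for every fixed $x$, i.e., $N \overset{p}{\longrightarrow} \infty$ and thus $N_{[h(n),\infty)} \overset{p}{\longrightarrow} \infty$, as we claim in
this case.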

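(ii): Part (i) applies with $\alpha = 0$, and yields $N_{[h(n),\infty)} \overset{p}{\longrightarrow} 0$, which means
$N_{[h(n),\infty)} = 0$ w.h.p. Thus $Y_{(1)} < h(n)$ w.h.p. by (18.10).

(iii): Part (i) applies with $\alpha = \infty$, and yields $N_{[h(n),\infty)} \overset{p}{\longrightarrow} \infty$. Thus, for
every fixed $j$, by (18.10), $\mathbb P(Y_{(j)} < h(n)) = \mathbb P(N_{[h(n),\infty)} < j) \to 0$.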
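(iv): Suppose not. Then there exists a sequence $h(n)$ and an $\varepsilon > 0$ such
that, for some subsequence,

$$\tilde d_{\mathrm K}\bigl(N_{[h(n),\infty)}, \overline N_{[h(n),\infty)}\bigr) > \varepsilon. \tag{18.25}$$

We may select a subsubsequence such that $n\,\mathbb P(\xi \ge h(n)) \to \alpha$ for some
$\alpha \in [0,\infty]$; then $\tilde d_{\mathrm K}(N_{[h(n),\infty)}, \operatorname{Po}(\alpha)) \to 0$ by (i) and Lemma 18.5(iii).
Moreover, along the same subsubsequence, $\overline N_{[h(n),\infty)} \sim \operatorname{Bi}(n, \mathbb P(\xi \ge h(n))) \overset{d}{\longrightarrow}
\operatorname{Po}(\alpha)$, by the standard Poisson approximation for binomial distributions
(and rather trivially if $\alpha = \infty$); hence $\tilde d_{\mathrm K}(\overline N_{[h(n),\infty)}, \operatorname{Po}(\alpha)) \to 0$. The
triangle inequality yields $\tilde d_{\mathrm K}(N_{[h(n),\infty)}, \overline N_{[h(n),\infty)}) \to 0$ along the subsubsequence,
which contradicts (18.25). This contradiction proves (iv).

(v): Suppose not. Then, by (18.11), there is an $\varepsilon > 0$ and a subsequence
such that for some $h(n)$,

$$\bigl|\mathbb P(Y_{(j)} \le h(n)) - \mathbb P(\xi_{(j)} \le h(n))\bigr| \ge \varepsilon. \tag{18.26}$$

However, by (18.10), (18.13) and (iv),

$$\begin{aligned}
\bigl|\mathbb P(Y_{(j)} \le h(n)) - \mathbb P(\xi_{(j)} \le h(n))\bigr|
&= \bigl|\mathbb P(N_{[h(n)+1,\infty)} \le j-1) - \mathbb P(\overline N_{[h(n)+1,\infty)} \le j-1)\bigr| \\
&\le j\,\tilde d_{\mathrm K}\bigl(N_{[h(n)+1,\infty)}, \overline N_{[h(n)+1,\infty)}\bigr) \to 0,
\end{aligned}$$

which contradicts (18.26). This contradiction proves (v).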



(vi): Let $A = A(n) := \{i : \mathbb P(Y_{(1)} = i) > \mathbb P(\xi_{(1)} = i)\}$; thus, see
Lemma 18.5(vi),

$$d_{\mathrm{TV}}(Y_{(1)}, \xi_{(1)}) = \mathbb P(Y_{(1)} \in A) - \mathbb P(\xi_{(1)} \in A). \tag{18.27}$$

Let $\delta > 0$. For each $n$, we partition $\mathbb N_0$ into a finite family $\mathcal P = \{J_l\}_{l=1}^L$ of
intervals as follows. First, each $i \in \mathbb N_0$ with $\mathbb P(\xi_{(1)} = i) \ge \delta/2$ is a singleton
$\{i\}$; note that there are at most $2/\delta$ such $i$. The complement of the set of
these $i$ consists of at most $2/\delta + 1$ intervals $\tilde J_k$ (of which one is infinite). We
partition each such interval $\tilde J_k$ further into intervals $J_l$ with $\mathbb P(\xi_{(1)} \in J_l) \le \delta$
by repeatedly chopping off the largest such subinterval starting at the left
endpoint. Since only points with $\mathbb P(\xi_{(1)} = i) < \delta/2$ remain, each such interval
$J_l$ except the last in each $\tilde J_k$ satisfies $\mathbb P(\xi_{(1)} \in J_l) > \delta/2$. Hence, our final
partition $\{J_l\}$ contains at most $2/\delta + 1$ intervals $J_l$ with $\mathbb P(\xi_{(1)} \in J_l) < \delta/2$,
while the number of intervals $J_l$ with $\mathbb P(\xi_{(1)} \in J_l) \ge \delta/2$ is clearly at most
$2/\delta$. Consequently, $L$, the total number of intervals, is at most $4/\delta + 1$.
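The partition just described is a simple greedy procedure, and can be sketched as follows for a mass function given as a finite list (standing in for the distribution of the maximum of the i.i.d. sample). The function name, the finite truncation of $\mathbb N_0$ and the Poisson example are assumptions made only for this illustration; in the truncated setting the last interval plays the role of the infinite one.

```python
import math

def partition_intervals(pmf, delta):
    """Split indices 0..len(pmf)-1 into intervals (a, b), inclusive endpoints.

    Points with mass >= delta/2 become singletons; each maximal run of the
    remaining points is chopped greedily from the left into intervals of
    mass <= delta, each chosen as long as possible.
    """
    heavy = [p >= delta / 2 for p in pmf]
    intervals = []
    i, size = 0, len(pmf)
    while i < size:
        if heavy[i]:
            intervals.append((i, i))
            i += 1
            continue
        j, mass = i, pmf[i]
        # Extend while the next point is light and the mass stays <= delta.
        while j + 1 < size and not heavy[j + 1] and mass + pmf[j + 1] <= delta:
            j += 1
            mass += pmf[j]
        intervals.append((i, j))
        i = j + 1
    return intervals

# Hypothetical example: a Poisson(3) mass function truncated to 0..40.
pmf = [math.exp(-3 + k * math.log(3) - math.lgamma(k + 1)) for k in range(41)]
delta = 0.2
intervals = partition_intervals(pmf, delta)
masses = [sum(pmf[a:b + 1]) for a, b in intervals]
```

In this example the points $1, \dots, 5$ are singletons, and the two remaining runs $\{0\}$ and $\{6, \dots, 40\}$ each form a single interval, so the number of intervals is well below the bound $4/\delta + 1$.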

We write $J_l = [a_l, b_l]$. We say that an interval $J_l \in \mathcal P$ is fat if $\mathbb P(\xi_{(1)} \in
J_l) > \delta$, and thin otherwise. Note that by our construction, a fat interval is
a singleton $\{a_l\}$.

Next, fix a large number $D$. We say that an interval $J_l = [a_l, b_l] \in \mathcal P$ is
good if $n\,\mathbb P(\xi \ge a_l) \le D$, and bad otherwise.
For any interval $J_l$,

$$\bigl|\mathbb P(Y_{(1)} \in J_l) - \mathbb P(\xi_{(1)} \in J_l)\bigr| \le 2 d_{\mathrm K}(Y_{(1)}, \xi_{(1)}) = o(1) \tag{18.28}$$

by (v).

Let $A_l := A \cap J_l$. Thus $A$ is the disjoint union $\bigcup_l A_l$. ($A$ and $A_l$
depend on $n$.)

We note that if $J_l$ is fat, then $J_l$ is a singleton, and either $A_l = J_l$ or
$A_l = \emptyset$; in both cases we have, using (18.28),

$$\mathbb P(Y_{(1)} \in A_l) - \mathbb P(\xi_{(1)} \in A_l) \le 2 d_{\mathrm K}(Y_{(1)}, \xi_{(1)}) = o(1). \tag{18.29}$$

We next turn to the good intervals. We claim that, uniformly for all good
intervals $J_l$, as $n \to \infty$,

$$\mathbb P(Y_{(1)} \in A_l) \le e^{\delta e^{D}}\,\mathbb P(\xi_{(1)} \in A_l) + o(1). \tag{18.30}$$

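As usual, we suppose that this is not true and derive a contradiction. Thus,
assume that there is an $\varepsilon > 0$ and, for each $n$ in some subsequence, a good
interval $J_l = [a_l, b_l]$ (depending on $n$) such that

$$\mathbb P(Y_{(1)} \in A_l) > e^{\delta e^{D}}\,\mathbb P(\xi_{(1)} \in A_l) + \varepsilon. \tag{18.31}$$

If $J_l$ is fat, then (18.31) contradicts (18.29) for large $n$, so we may assume
that $J_l$ is thin, i.e., $\mathbb P(\xi_{(1)} \in J_l) \le \delta$.

Let $A_l^c := J_l \setminus A_l$ and $B_l := [b_l + 1, \infty)$. Let $\alpha_n := n\,\mathbb P(\xi \in A_l)$, $\beta_n :=
n\,\mathbb P(\xi \in B_l)$ and $\gamma_n := n\,\mathbb P(\xi \in A_l^c)$. The assumption that $J_l$ is good implies
that $\alpha_n + \beta_n + \gamma_n = n\,\mathbb P(\xi \ge a_l) \le D$. By selecting a subsubsequence we
may assume that $\alpha_n \to \alpha$, $\beta_n \to \beta$ and $\gamma_n \to \gamma$ for some real $\alpha, \beta, \gamma$ with
$\alpha + \beta + \gamma \le D$. Then (i) shows that $N_{B_l} \overset{d}{\longrightarrow} \operatorname{Po}(\beta)$; moreover, the proof
extends easily (using joint factorial moments) to show that $N_{A_l} \overset{d}{\longrightarrow} \operatorname{Po}(\alpha)$,
$N_{B_l} \overset{d}{\longrightarrow} \operatorname{Po}(\beta)$ and $N_{A_l^c} \overset{d}{\longrightarrow} \operatorname{Po}(\gamma)$, jointly and with independent limits.
Similarly, by the method of moments or otherwise (this is a standard Poisson
approximation of a multinomial distribution), $\overline N_{A_l} \overset{d}{\longrightarrow} \operatorname{Po}(\alpha)$, $\overline N_{B_l} \overset{d}{\longrightarrow}
\operatorname{Po}(\beta)$ and $\overline N_{A_l^c} \overset{d}{\longrightarrow} \operatorname{Po}(\gamma)$, jointly and with independent limits.
Note that

$$Y_{(1)} \in A_l \implies N_{A_l} \ge 1 \text{ and } N_{B_l} = 0.$$

Conversely,

$$N_{A_l} \ge 1 \text{ and } N_{B_l} = N_{A_l^c} = 0 \implies Y_{(1)} \in A_l.$$

The corresponding results hold for $\xi_{(1)}$. Thus,

$$\mathbb P(Y_{(1)} \in A_l) \le \mathbb P(N_{A_l} \ge 1,\ N_{B_l} = 0) \to \mathbb P(\operatorname{Po}(\alpha) \ge 1)\,\mathbb P(\operatorname{Po}(\beta) = 0) \tag{18.32}$$

and

$$\begin{aligned}
\mathbb P(\xi_{(1)} \in A_l) &\ge \mathbb P(\overline N_{A_l} \ge 1,\ \overline N_{B_l} = \overline N_{A_l^c} = 0) \\
&\to \mathbb P(\operatorname{Po}(\alpha) \ge 1)\,\mathbb P(\operatorname{Po}(\beta) = 0)\,\mathbb P(\operatorname{Po}(\gamma) = 0).
\end{aligned} \tag{18.33}$$

Since $\mathbb P(\operatorname{Po}(\gamma) = 0) = e^{-\gamma}$, (18.32)-(18.33) yield

$$\mathbb P(Y_{(1)} \in A_l) - e^{\gamma}\,\mathbb P(\xi_{(1)} \in A_l) \le o(1). \tag{18.34}$$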
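Moreover, $\overline N_{J_l} = \overline N_{A_l} + \overline N_{A_l^c} \overset{d}{\longrightarrow} \operatorname{Po}(\alpha + \gamma)$, and thus

$$\begin{aligned}
\mathbb P(\xi_{(1)} \in J_l) &= \mathbb P(\overline N_{J_l} \ge 1,\ \overline N_{B_l} = 0) \ge \mathbb P(\overline N_{J_l} = 1,\ \overline N_{B_l} = 0) \\
&\to (\alpha + \gamma) e^{-\alpha-\gamma} e^{-\beta}.
\end{aligned} \tag{18.35}$$

We are assuming that $J_l$ is thin, i.e., $\mathbb P(\xi_{(1)} \in J_l) \le \delta$, and thus (18.35)
yields $(\alpha + \gamma) e^{-\alpha-\gamma} e^{-\beta} \le \delta$ and consequently

$$\gamma \le \alpha + \gamma \le \delta e^{\alpha+\beta+\gamma} \le \delta e^{D}.$$

Hence, (18.34) implies

$$\mathbb P(Y_{(1)} \in A_l) \le e^{\delta e^{D}}\,\mathbb P(\xi_{(1)} \in A_l) + o(1),$$

which contradicts (18.31). This contradiction shows that (18.30) holds
uniformly for all good intervals.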

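It remains to consider the bad intervals.

Let $J_\ell = [a_\ell, b_\ell]$ be the rightmost bad interval. If $J_\ell$ is fat we use (18.29)
and if $J_\ell$ is thin we use (18.28) which gives

$$\mathbb P(Y_{(1)} \in A_\ell) \le \mathbb P(Y_{(1)} \in J_\ell) \le \mathbb P(\xi_{(1)} \in J_\ell) + o(1) \le \delta + o(1).$$

In both cases,

$$\mathbb P(Y_{(1)} \in A_\ell) \le \mathbb P(\xi_{(1)} \in A_\ell) + \delta + o(1). \tag{18.36}$$

Finally, let $A^*$ be the union of the remaining bad intervals. Then $A^* =
[0, a_\ell - 1]$ and by (v)

$$\mathbb P(Y_{(1)} \in A^*) = \mathbb P(Y_{(1)} < a_\ell) \le \mathbb P(\xi_{(1)} < a_\ell) + o(1). \tag{18.37}$$

Furthermore, recalling $n\,\mathbb P(\xi \ge a_\ell) > D$ since $J_\ell$ is bad,

$$\mathbb P(\xi_{(1)} < a_\ell) = \mathbb P(\overline N_{[a_\ell,\infty)} = 0) = \bigl(1 - \mathbb P(\xi \ge a_\ell)\bigr)^n \le e^{-n\mathbb P(\xi \ge a_\ell)} \le e^{-D}. \tag{18.38}$$

We obtain by summing (18.30) for all good intervals together with (18.36)
and (18.37), recalling that the number of intervals is bounded (for a fixed
$\delta$) and using (18.38),

$$\begin{aligned}
\mathbb P(Y_{(1)} \in A) &= \sum_l \mathbb P(Y_{(1)} \in A_l) \\
&\le e^{\delta e^{D}} \sum_l \mathbb P(\xi_{(1)} \in A_l) + o(1) + \delta + \mathbb P(\xi_{(1)} < a_\ell) \\
&\le e^{\delta e^{D}}\,\mathbb P(\xi_{(1)} \in A) + o(1) + \delta + e^{-D}.
\end{aligned}$$

Consequently,

$$\begin{aligned}
d_{\mathrm{TV}}(Y_{(1)}, \xi_{(1)}) &= \mathbb P(Y_{(1)} \in A) - \mathbb P(\xi_{(1)} \in A) \\
&\le \bigl(e^{\delta e^{D}} - 1\bigr)\,\mathbb P(\xi_{(1)} \in A) + \delta + e^{-D} + o(1) \\
&\le \bigl(e^{\delta e^{D}} - 1\bigr) + \delta + e^{-D} + o(1),
\end{aligned}$$

and thus

$$\limsup_{n\to\infty} d_{\mathrm{TV}}(Y_{(1)}, \xi_{(1)}) \le \bigl(e^{\delta e^{D}} - 1\bigr) + \delta + e^{-D}. \tag{18.39}$$

Letting first $\delta \to 0$ and then $D \to \infty$, we obtain $d_{\mathrm{TV}}(Y_{(1)}, \xi_{(1)}) \to 0$,
which proves (vi).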



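This completes the proof of the version with $\xi$ and the assumption $m =
\lambda n + o(n^{1/2})$. Now remove this assumption, but assume $\lambda < \nu$ and thus
$\tau < \rho$. We consider only $n$ with $0 < m/n < \nu$ and thus $0 < \tau_n < \rho$. Denote
the distribution (13.10) of $\xi^{(n)}$ by $\mathbf w^{(n)}$ (this is a probability weight sequence
equivalent to $\mathbf w$) and let $S_n^{(n)} := \xi_1^{(n)} + \dots + \xi_n^{(n)}$. Then, by Example 10.2
applied to $\mathbf w^{(n)}$, in analogy with (18.14) (and equivalent to it by (10.9)),

$$\mathbb P(Y_1 = k) = \frac{w_k^{(n)} Z(m-k, n-1; \mathbf w^{(n)})}{Z(m, n; \mathbf w^{(n)})} = \frac{Z(m-k, n-1; \mathbf w^{(n)})}{Z(m, n; \mathbf w^{(n)})}\,\mathbb P(\xi^{(n)} = k). \tag{18.40}$$

Furthermore, for any $y \ge 0$, using (13.12),

$$\begin{aligned}
\mathbb P(Y_{(1)} \ge y) &\le n\,\mathbb P(Y_1 \ge y) = n\,\mathbb P\bigl(\xi_1^{(n)} \ge y \mid S_n^{(n)} = m\bigr) \\
&\le \frac{n\,\mathbb P(\xi_1^{(n)} \ge y)}{\mathbb P(S_n^{(n)} = m)} \le n\,\mathbb P(\xi^{(n)} \ge y) \cdot O(n^{1/2}).
\end{aligned} \tag{18.41}$$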
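Choose $\tau_* \in (\tau, \rho)$. Then, for $s \ge 0$ and $n$ so large that $\tau_n < \tau_*$, by (4.11),

$$\mathbb E\, e^{s\xi^{(n)}} = \frac{\Phi(e^s\tau_n)}{\Phi(\tau_n)} \le \frac{\Phi(e^s\tau_*)}{\Phi(0)}. \tag{18.42}$$

Choosing $s > 0$ with $e^s < \rho/\tau_*$, we thus find $\mathbb P(\xi^{(n)} \ge y) = O(e^{-sy})$ and, by
(18.41),

$$\mathbb P(Y_{(1)} \ge y) = O(n^{3/2} e^{-sy}).$$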
We now define $B(n) := 2s^{-1}\log n$, and obtain

$$\mathbb P(Y_{(1)} \ge B(n)) = O(n^{3/2} e^{-sB(n)}) = O(n^{-1/2}) \to 0. \tag{18.43}$$

Hence, $Y_{(1)} < B(n)$ w.h.p. Similarly, using (18.42) again,

$$\mathbb P(\xi^{(n)} \ge B(n)) = o(n^{-1}) \tag{18.44}$$

and thus $\mathbb P(\xi^{(n)}_{(1)} \ge B(n)) \le n\,\mathbb P(\xi^{(n)} \ge B(n)) \to 0$, so $\xi^{(n)}_{(1)} < B(n)$ w.h.p.

We have shown that (18.19) (with $\xi^{(n)}$) and (18.20) hold. Moreover,
Lemma 13.1 yields, see (13.12) again, $Z(m, n; \mathbf w^{(n)}) \sim d/(2\pi\sigma^2 n)^{1/2}$, and for
$k \le B(n) = O(\log n)$, the same argument yields also, using Remark 13.2,
$Z(m-k, n-1; \mathbf w^{(n)}) \sim d/(2\pi\sigma^2 n)^{1/2}$, because $m - k - (n-1)\mathbb E\,\xi^{(n)} =
m - k - (n-1)m/n = -k + m/n = o(n^{1/2})$. Consequently, (18.40) yields

$$\mathbb P(Y_1 = k) = (1 + o(1))\,\mathbb P(\xi^{(n)} = k), \tag{18.45}$$

uniformly for $k \le B(n)$.

We can now argue exactly as above, using $\xi^{(n)}$, $\xi^{(n)}_{(j)}$ and $\overline N^{(n)}$, which
proves this version of the theorem.

Finally, if $\lambda < \nu$ and $m/n = \lambda + o(1/\log n)$, then $\tau_n := \Psi^{-1}(m/n) =
\tau + o(1/\log n)$, because $\Psi^{-1}$ is differentiable on $(0, \nu)$. Since we are assuming
$\tau = 1$ and $\Phi(\tau) = 1$ in the proof, we thus have, uniformly for all $k \le B(n) =
O(\log n)$,

$$\mathbb P(\xi^{(n)} = k) = \frac{\tau_n^k}{\Phi(\tau_n)}\, w_k = (1 + o(1)) w_k = (1 + o(1))\,\mathbb P(\xi = k). \tag{18.46}$$

Since also $\mathbb P(\xi^{(n)} \ge B(n)) = o(n^{-1})$ and $\mathbb P(\xi \ge B(n)) = o(n^{-1})$, it follows
that $n\,\mathbb P(\xi^{(n)} \ge h(n)) \to \alpha \iff n\,\mathbb P(\xi \ge h(n)) \to \alpha$, and thus we may in
(i)-(iii) replace $\xi^{(n)}$ by $\xi$ again. Finally, (iv)-(vi) follow as above in this case
too. □



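Proof of Theorem 18.3. Recall that $\lambda < \nu \iff \tau < \rho$ by Lemma 3.1. We
have $\nu > \lambda > 0$, so $\rho > 0$, and $\tau > 0$. Thus $1 < \rho/\tau \le \infty$.

(i): Fix $a > 1/\log(\rho/\tau)$. Choose $b$ with $e^{1/a} < b < \rho/\tau$. Then $1/\rho <
(b\tau)^{-1}$. Choose $c$ with $1/\rho < c < (b\tau)^{-1}$.

Since $\limsup_{k\to\infty} w_k^{1/k} = 1/\rho < c$, we have $w_k^{1/k} < c$ for large $k$, and then,
with $\tau_n$ and $\xi^{(n)}$ defined as above (see (13.10)),

$$\mathbb P(\xi^{(n)} = k) = \frac{w_k \tau_n^k}{\Phi(\tau_n)} \le \frac{(c\tau_n)^k}{\Phi(0)}. \tag{18.47}$$

As $n \to \infty$, $c\tau_n \to c\tau < b^{-1}$. Let $h := \lfloor a \log n\rfloor$. For large $n$, (18.47) applies
for $k \ge h$, and $c\tau_n < b^{-1} < 1$, and then

$$\mathbb P(\xi^{(n)} \ge h) \le \sum_{k=h}^{\infty} \frac{(c\tau_n)^k}{w_0} = O(b^{-h}) = O(n^{-a\log b}).$$

Since $a \log b > 1$, thus $n\,\mathbb P(\xi^{(n)} \ge h) \to 0$, and Theorem 18.7(ii) yields
$Y_{(1)} \le h \le a \log n$ w.h.p.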

(ii): If $\rho = \infty$, then (i) applies with $\rho/\tau = \infty$ and thus $1/\log(\rho/\tau) = 0$.

(iii): If $\rho = \infty$, the result follows by (ii), so we may assume $\tau < \rho <
\infty$. Let $a := 1/\log(\rho/\tau)$ and $0 < \varepsilon < 1$. The upper bound $Y_{(j)} \le Y_{(1)} \le
(a + \varepsilon) \log n$ w.h.p. follows from (i), and it remains to find a matching lower
bound.

Let $k := \lceil (1 - \varepsilon) a \log n \rceil$. Then, since $\tau_n \to \tau$,

$$\begin{aligned}
\log \mathbb P(\xi^{(n)} = k) &= \log w_k + k \log \tau_n - \log \Phi(\tau_n) \\
&= -k(\log\rho + o(1)) + k(\log\tau + o(1)) + O(1) \\
&= -k\log(\rho/\tau) + o(k) = -(1 - \varepsilon + o(1)) \log n
\end{aligned}$$

and thus

$$n\,\mathbb P(\xi^{(n)} \ge k) \ge n\,\mathbb P(\xi^{(n)} = k) = n^{\varepsilon + o(1)} \to \infty.$$

By Theorem 18.7(iii) (and the last sentence in Theorem 18.7), this implies
w.h.p.

$$Y_{(j)} \ge k \ge (1 - \varepsilon) a \log n.$$

This completes the proof, since we can take $\varepsilon$ arbitrarily small. □

Specialising Theorem 18.7 to the tree case ($m = n - 1$), we obtain the
following. (Recall that $\sigma^2 < \infty$ is automatic when $\nu > 1$.)

Corollary 18.11. Let $\mathbf w = (w_k)_{k\ge0}$ be a weight sequence with $w_0 > 0$ and
$w_k > 0$ for some $k \ge 2$, and let $\xi$ have the distribution given by $(\pi_k)$ in
(7.1). Suppose that $\nu \ge 1$ and $\sigma^2 := \operatorname{Var}\xi < \infty$. Then, as $n \to \infty$, for the
largest degrees $Y_{(1)} \ge Y_{(2)} \ge \cdots$ in $\mathcal T_n$, $d_{\mathrm{TV}}(Y_{(1)}, \xi_{(1)}) \to 0$ and, for every
fixed $j$, $d_{\mathrm K}(Y_{(j)}, \xi_{(j)}) \to 0$.

Proof. The case $\omega = \infty$ is a special case of Theorem 18.7, with $\lambda = 1$.

The case $\omega < \infty$ is trivial: for every fixed $j$, $Y_{(j)} = \omega$ w.h.p. by
Theorem 18.1 and, trivially, $\xi_{(j)} = \omega$ w.h.p. □
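For small $n$ the comparison in the corollary can be evaluated exactly rather than asymptotically. The sketch below assumes the balls-in-boxes description used throughout this section, namely that $(Y_1, \dots, Y_n)$ is distributed as i.i.d. copies of $\xi$ conditioned on their sum being $m$, so that $\mathbb P(Y_{(1)} \le k)$ is a ratio of two coefficients of powers of (truncated) probability generating functions. The choice $\xi \sim \operatorname{Ge}(1/2)$, $n = 100$, $m = n - 1$, and all function names are illustrative assumptions.

```python
def poly_mul(a, b, deg):
    """Product of two coefficient lists, truncated after degree `deg`."""
    out = [0.0] * (deg + 1)
    for i, ai in enumerate(a):
        if ai == 0.0:
            continue
        for j, bj in enumerate(b):
            if i + j > deg:
                break
            out[i + j] += ai * bj
    return out

def poly_pow(p, n, deg):
    """n-th power of a polynomial by repeated squaring, truncated at `deg`."""
    result = [1.0]
    base = p[: deg + 1]
    while n:
        if n & 1:
            result = poly_mul(result, base, deg)
        base = poly_mul(base, base, deg)
        n >>= 1
    return result

def conditioned_max_cdf(pmf, n, m, kmax):
    """P(max_i xi_i <= k | xi_1 + ... + xi_n = m) for k = 0..kmax."""
    total = poly_pow(pmf, n, m)[m]
    return [poly_pow(pmf[: k + 1], n, m)[m] / total for k in range(kmax + 1)]

n, m, kmax = 100, 99, 20
pmf = [2.0 ** -(k + 1) for k in range(m + 1)]            # Ge(1/2) on {0,1,...}
cond_cdf = conditioned_max_cdf(pmf, n, m, kmax)           # law of Y_(1)
iid_cdf = [sum(pmf[: k + 1]) ** n for k in range(kmax + 1)]  # law of xi_(1)
gap = max(abs(a - b) for a, b in zip(cond_cdf, iid_cdf))
```

The value `gap` is the Kolmogorov distance restricted to $k \le 20$ (beyond that both distribution functions are essentially 1); it is already small at $n = 100$, in line with the corollary, although the rate of convergence is not claimed here.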

The comparison with $\xi_{(j)}$ in Theorem 18.7 and Corollary 18.11 is appealing
since $\xi_{(j)}$ is the $j$:th largest of $n$ i.i.d. random variables. For applications
it is often convenient to modify this a little by taking a Poisson number of
variables instead.

Consider an infinite i.i.d. sequence $\xi_1, \xi_2, \dots$, let as above $\xi_{(j)}$ be the $j$:th
largest among the first $n$ elements of the sequence and define $\tilde\xi_{(j)}$ as the $j$:th
largest among the first $N(n)$ elements $\xi_1, \dots, \xi_{N(n)}$, where $N(n) \sim \operatorname{Po}(n)$ is
a random Poisson variable independent of $\xi_1, \xi_2, \dots$.

Lemma 18.12. W.h.p. $\tilde\xi_{(j)} = \xi_{(j)}$ and thus $d_{\mathrm{TV}}(\tilde\xi_{(j)}, \xi_{(j)}) \to 0$ as $n \to \infty$
for every fixed $j \ge 1$.

Proof. Let $n_\pm := \lfloor n \pm n^{2/3} \rfloor$, and let $\xi^{\pm}_{(j)}$ be the $j$:th largest of $\xi_1, \dots, \xi_{n_\pm}$.
By symmetry, the positions of the $j$ largest among $\xi_1, \dots, \xi_n$ are uniformly
random (we resolve any ties in the ordering at random); thus the probability
that one of them has index $> n_-$ is at most $j(n - n_-)/n = o(1)$. Hence,
w.h.p. all $j$ are among $\xi_1, \dots, \xi_{n_-}$, and then $\xi_{(j)} = \xi^{-}_{(j)}$.

Furthermore, w.h.p. $n_- \le N(n) \le n_+$, and a similar argument (using
conditioning on $N(n)$) shows that w.h.p. $\tilde\xi_{(j)} = \xi^{-}_{(j)}$. Hence, w.h.p. $\tilde\xi_{(j)} =
\xi^{-}_{(j)} = \xi_{(j)}$. The final claim follows by Lemma 18.5(v). □

We can thus replace $\xi_{(j)}$ by $\tilde\xi_{(j)}$ in Theorem 18.7 and Corollary 18.11. (We
can similarly replace $\xi^{(n)}_{(j)}$ by $\tilde\xi^{(n)}_{(j)}$, defined in the same way.) The advantage is
that, by standard properties of the Poisson distribution, the corresponding
counting variables

$$\tilde N_k := |\{i \le N(n) : \xi_i = k\}|$$

are independent Poisson variables with $\tilde N_k \sim \operatorname{Po}(n\,\mathbb P(\xi = k))$. We similarly
define $\tilde N_{[k,\infty)} := \sum_{l=k}^{\infty} \tilde N_l \sim \operatorname{Po}(n\,\mathbb P(\xi \ge k))$.

Remark 18.13. An equivalent way to express this is that the multiset
$\Xi_n := \{\xi_i : i \le N(n)\}$ is a Poisson process on $\mathbb N_0$ with intensity measure $\Lambda_n$
given by $\Lambda_n\{k\} = n\,\mathbb P(\xi = k)$.

We thus have (exactly), for any $j$ and $k$,

$$\mathbb P(\tilde\xi_{(j)} \le k) = \mathbb P(\tilde N_{[k+1,\infty)} < j) = \mathbb P\bigl(\operatorname{Po}(n\,\mathbb P(\xi > k)) < j\bigr); \tag{18.48}$$

in particular

$$\mathbb P(\tilde\xi_{(1)} \le k) = e^{-n\mathbb P(\xi > k)}. \tag{18.49}$$

This gives the following special case of Theorem 18.7. (There is a similar
version with $\xi^{(n)}$.)

Corollary 18.14. Suppose that $w_0 > 0$ and $\omega = \infty$. Suppose further that
$n \to \infty$ and $m = m(n)$ with $m = \lambda n + o(\sqrt n)$ where $0 < \lambda \le \nu$, and that
either $\lambda < \nu$ or $\sigma^2 := \operatorname{Var}\xi < \infty$. Then, uniformly in all $k \ge 0$,

$$\mathbb P(Y_{(j)} \le k) = \mathbb P\bigl(\operatorname{Po}(n\,\mathbb P(\xi > k)) < j\bigr) + o(1) \tag{18.50}$$

for each fixed $j \ge 1$; in particular

$$\mathbb P(Y_{(1)} \le k) = e^{-n\mathbb P(\xi > k)} + o(1). \tag{18.51}$$

Proof. Immediate by Theorem 18.7(v), Lemma 18.12 and (18.48)-(18.49).

□
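The main terms in (18.50) and (18.51) are elementary to evaluate. The following minimal sketch computes $\mathbb P(\operatorname{Po}(n\,\mathbb P(\xi > k)) < j)$ for a user-supplied tail function; the function names and the geometric example are hypothetical, and the $o(1)$ error term of the corollary is of course not captured.

```python
import math

def poisson_prob_below(mu, j):
    """P(Po(mu) < j) = sum over i < j of exp(-mu) * mu**i / i!."""
    term = math.exp(-mu)
    total = 0.0
    for i in range(j):
        total += term
        term *= mu / (i + 1)
    return total

def order_stat_cdf_main_term(n, tail, j, k):
    """Main term of (18.50): P(Po(n * P(xi > k)) < j)."""
    return poisson_prob_below(n * tail(k), j)

def geometric_tail(k):
    """P(xi > k) for xi ~ Ge(1/2) on {0, 1, ...}."""
    return 2.0 ** -(k + 1)

largest = order_stat_cdf_main_term(1000, geometric_tail, 1, 9)
second = order_stat_cdf_main_term(1000, geometric_tail, 2, 9)
```

For $j = 1$ the value reduces to $e^{-n\mathbb P(\xi > k)}$ as in (18.51), and the value for $j = 2$ is larger, since the second largest element is at most the largest.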

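Remark 18.15. Since $\tilde N_{[h(n),\infty)} \ge j \iff \tilde\xi_{(j)} \ge h(n)$, it follows easily
from Lemmas 18.12 and 18.5(iv) that for any sequence $h(n)$,

$$\tilde d_{\mathrm K}\bigl(\overline N_{[h(n),\infty)}, \tilde N_{[h(n),\infty)}\bigr) \to 0. \tag{18.52}$$

Hence, Theorem 18.7(iv) is equivalent to $\tilde d_{\mathrm K}\bigl(N_{[h(n),\infty)}, \tilde N_{[h(n),\infty)}\bigr) \to 0$, and
thus

$$\tilde d_{\mathrm K}\bigl(N_{[h(n),\infty)}, \operatorname{Po}(n\,\mathbb P(\xi \ge h(n)))\bigr) \to 0. \tag{18.53}$$

This is another, essentially equivalent, way to express the results above.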

18.3. The subcase $\lambda < \nu$. When $\lambda < \nu$, we have $\tau < \rho$ and the random
variable ^ has some finite exponential moment, cf. Section (H hence the 
probabilities $\pi_k$ decrease rapidly. Theorem 18.7 and Corollary 18.14 show
that $Y_{(1)}$ (and each $Y_{(j)}$) has its distribution concentrated on $k$ such that
$\mathbb P(\xi \ge k)$ is of the order $1/n$. If the decrease of $\pi_k$ is not too irregular, this
implies strong concentration of $Y_{(1)}$, with, roughly speaking, $Y_{(1)} \approx k$ when
$\mathbb P(\xi \ge k) \approx 1/n$. To make this precise, we define three versions of a suitable
such estimate $k = k(n)$. Let, as above, $\pi_k = \mathbb P(\xi = k) = \tau^k w_k/\Phi(\tau)$ and let

$$\Pi_k := \mathbb P(\xi \ge k) = \sum_{l=k}^{\infty} \pi_l. \tag{18.54}$$

Define

$$k_1(n) := \max\{k : \pi_k \ge 1/n\}, \tag{18.55}$$
$$k_2(n) := \max\{k : \Pi_k \ge 1/n\}, \tag{18.56}$$
$$k_3(n) := \max\{k : \sqrt{\Pi_k \Pi_{k+1}} \ge 1/n\}. \tag{18.57}$$

Note that $k_1(n) \le k_2(n)$ and $k_2(n) - 1 \le k_3(n) \le k_2(n)$.
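The three estimates are easy to compute for a given distribution. The sketch below reads the definitions (18.55)-(18.57) as $k_1(n) = \max\{k : \pi_k \ge 1/n\}$, $k_2(n) = \max\{k : \Pi_k \ge 1/n\}$ and $k_3(n) = \max\{k : \sqrt{\Pi_k\Pi_{k+1}} \ge 1/n\}$, with $\Pi_k = \mathbb P(\xi \ge k)$; the function name, the truncation of the mass functions and the two example distributions are assumptions made for the illustration.

```python
import math

def thresholds(pmf, n):
    """Return (k1, k2, k3) for a mass function given as a finite list."""
    size = len(pmf)
    tails = [0.0] * (size + 1)            # tails[k] = P(xi >= k)
    for k in range(size - 1, -1, -1):
        tails[k] = tails[k + 1] + pmf[k]
    k1 = max(k for k in range(size) if pmf[k] >= 1 / n)
    k2 = max(k for k in range(size) if tails[k] >= 1 / n)
    k3 = max(k for k in range(size)
             if math.sqrt(tails[k] * tails[k + 1]) >= 1 / n)
    return k1, k2, k3

geometric = [2.0 ** -(k + 1) for k in range(200)]               # Ge(1/2)
poisson = [math.exp(-1) / math.factorial(k) for k in range(60)]  # Po(1)

geo_k = thresholds(geometric, 1000)
poi_k = thresholds(poisson, 1000)
```

With $n = 1000$ this gives $(8, 9, 9)$ for the geometric distribution and $(5, 5, 5)$ for the Poisson distribution, consistent with $k_1(n) \le k_2(n)$ and $k_2(n) - 1 \le k_3(n) \le k_2(n)$.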

We consider the typical case when $w_{k+1}/w_k$ converges as $k \to \infty$. We
assume implicitly that $w_{k+1}/w_k$ is defined for all large $k$; thus $w_k > 0$ and
$\omega = \infty$. If $w_{k+1}/w_k \to a$ as $k \to \infty$, then (3.5) yields $\rho = 1/a$; hence $\rho = \infty$
if $a = 0$ and $0 < \rho < \infty$ if $a > 0$.

Theorem 18.16. Suppose that $w_0 > 0$ and that $w_{k+1}/w_k \to a < \infty$ as
$k \to \infty$. Suppose further that $n \to \infty$ and $m = m(n)$ with $m = \lambda n + o(\sqrt n)$
where $0 < \lambda < \nu$.

(i) Then, for each $j \ge 1$,

$$Y_{(j)} = k_1(n) + O_p(1) = k_2(n) + O_p(1) = k_3(n) + O_p(1).$$

(ii) If $a = 0$, then, moreover, w.h.p.,

$$|Y_{(j)} - k_1(n)| \le 1, \qquad |Y_{(j)} - k_2(n)| \le 1, \qquad Y_{(j)} \in \{k_3(n), k_3(n) + 1\}.$$

Proof. (i): We have, as said above, $\rho = 1/a > 0$. Furthermore, since $\lambda < \nu$,
we have $\tau < \rho$ and thus, as $k \to \infty$,

$$\frac{\pi_{k+1}}{\pi_k} = \tau\,\frac{w_{k+1}}{w_k} \to \tau a = \frac{\tau}{\rho} < 1. \tag{18.58}$$

It follows from (18.58) and (18.54), using dominated convergence, that, as
$k \to \infty$,

$$\frac{\Pi_k}{\pi_k} = \sum_{l=0}^{\infty} \frac{\pi_{k+l}}{\pi_k} \to \sum_{l=0}^{\infty} (\tau a)^l = \frac{1}{1 - \tau a}. \tag{18.59}$$

If $\ell$ is chosen such that $(\tau a)^\ell < 1 - \tau a$, then (18.59) and (18.58) imply
$\Pi_{k+\ell}/\pi_k \to (\tau a)^\ell/(1 - \tau a) < 1$ as $k \to \infty$, and thus, for large $k$, $\Pi_{k+\ell} <
\pi_k \le \Pi_k$; hence, for large $n$, $k_1(n) \le k_2(n) \le k_1(n) + \ell$. Thus, recalling that
$|k_2(n) - k_3(n)| \le 1$,

$$k_1(n) = k_2(n) + O(1) = k_3(n) + O(1). \tag{18.60}$$

Furthermore, (18.58) and (18.59) yield also

$$\frac{\Pi_{k+1}}{\Pi_k} \to \tau a < 1. \tag{18.61}$$

By (18.56), $n\Pi_{k_2(n)} \ge 1 > n\Pi_{k_2(n)+1}$. This and (18.61) imply that if $\Omega(n)$ is
any sequence with $\Omega(n) \to \infty$, then $n\Pi_{k_2(n)-\Omega(n)} \to \infty$ and $n\Pi_{k_2(n)+\Omega(n)} \to
0$. Consequently, recalling the definition (18.54), by Theorem 18.7(ii)-(iii)
(or by Corollary 18.14) w.h.p. $Y_{(j)} \ge k_2(n) - \Omega(n)$ and $Y_{(j)} < k_2(n) + \Omega(n)$.
Since $\Omega(n) \to \infty$ is arbitrary, this yields $Y_{(j)} = k_2(n) + O_p(1)$. (See e.g.
[61].) The result follows by (18.60).

(ii): When $a = 0$, (18.59) yields $\Pi_k \sim \pi_k$, (18.58) yields $\pi_{k+1}/\pi_k \to 0$ and
(18.61) yields $\Pi_{k+1}/\Pi_k \to 0$ as $k \to \infty$. It follows easily from (18.55)-(18.57)
that $n\Pi_{k_1(n)-1} \to \infty$, $n\Pi_{k_1(n)+2} \to 0$, $n\Pi_{k_2(n)-1} \to \infty$, $n\Pi_{k_2(n)+2} \to 0$,
$n\Pi_{k_3(n)} \to \infty$, $n\Pi_{k_3(n)+2} \to 0$, and the results follow by Theorem 18.7(ii)-(iii). □
If $a = 0$, i.e. $w_{k+1}/w_k \to 0$ as $k \to \infty$, thus $Y_{(1)}$ is asymptotically
concentrated at one or two values. (This was shown, in the tree case, by Meir and
Moon [88], after showing concentration to at most three values in [87]; see
also Kolchin, Sevast'yanov and Chistyakov [77], Kolchin [76] and Carr, Goh
and Schmutz [21] for special cases.) If $a > 0$, we still have a strong
concentration, but not to any finite number of values as is seen by Theorem 18.19
below.

We consider two important examples, where we apply this to random
trees, so $m = n - 1$ and $\lambda = 1$. (Recall that $Y_{(1)}$ then is the largest outdegree
in $\mathcal T_n$. The largest degree is w.h.p. $Y_{(1)} + 1$, since w.h.p. it is not attained at
the root, e.g. because the root degree is $O_p(1)$ by Theorem 7.10; this should
be kept in mind when comparing with results in other papers.)

Example 18.17. For uniform random labelled ordered rooted trees, we have
by Example 9.1 $\xi \sim \operatorname{Ge}(1/2)$ with $\pi_k = 2^{-k-1}$ and thus $\mathbb P(\xi \ge k) = 2^{-k}$.
Hence $Y_{(1)}$ has asymptotically the same distribution as the maximum of $n$
i.i.d. geometrically distributed random variables, which is a simple and
well-studied example, see e.g. Leadbetter, Lindgren and Rootzén [82]. Explicitly,
Corollary 18.14 applies and (18.51) yields, uniformly in $k \ge 0$,

$$\mathbb P(Y_{(1)} \le k) = e^{-n 2^{-k-1}} + o(1). \tag{18.62}$$

(This was, essentially, shown by Meir and Moon [87].)

One way to express this is to introduce a random variable $W$ with the
Gumbel distribution

$$\mathbb P(W \le x) = e^{-e^{-x}}, \qquad -\infty < x < \infty. \tag{18.63}$$

Then (18.62) yields, uniformly for $k \in \mathbb Z$,

$$\begin{aligned}
\mathbb P(Y_{(1)} \le k) &= \mathbb P\bigl(W < (k+1)\log 2 - \log n\bigr) + o(1) \\
&= \mathbb P\Bigl(\Bigl\lfloor \frac{W + \log n}{\log 2} \Bigr\rfloor \le k\Bigr) + o(1).
\end{aligned} \tag{18.64}$$

In other words, extending $d_{\mathrm K}$ to $\mathbb Z$-valued random variables,

$$d_{\mathrm K}\bigl(Y_{(1)}, \lfloor (W + \log n)/\log 2 \rfloor\bigr) \to 0. \tag{18.65}$$

Thus, the maximum degree $Y_{(1)}$ can be approximated (in distribution) by
$\lfloor (W + \log n)/\log 2 \rfloor = \lfloor W/\log 2 + \log_2 n \rfloor$. Hence $Y_{(1)} - \log_2 n$ is tight but
no asymptotic distribution exists; $Y_{(1)} - \log_2 n$ can be approximated by
$\lfloor W/\log 2 + \log_2 n \rfloor - \log_2 n = \lfloor W/\log 2 + \{\log_2 n\} \rfloor - \{\log_2 n\}$ (where we let
$\{x\} := x - \lfloor x \rfloor$ denote the fractional part of $x$), which shows convergence
in distribution for any subsequence such that $\{\log_2 n\}$ converges to some
$\alpha \in [0, 1]$, but the limit depends on $\alpha$. See further Janson [60, in particular
Lemma 4.1 and Example 4.3].
In the same way we see that $Y_{(j)}$ can be approximated in distribution by
$\lfloor W_j/\log 2 + \log_2 n \rfloor$ where $W_j$ has the distribution

$$\mathbb P(W_j \le x) = \mathbb P\bigl(\operatorname{Po}(e^{-x}) < j\bigr) = \sum_{i=0}^{j-1} \frac{e^{-ix}}{i!}\, e^{-e^{-x}}, \qquad -\infty < x < \infty, \tag{18.66}$$

with density function $e^{-jx} e^{-e^{-x}}/(j-1)!$; further $W_j \overset{d}{=} -\log V_j$, where $V_j$ has
the Gamma distribution $\operatorname{Gamma}(j, 1)$. (Cf. Leadbetter, Lindgren and Rootzén
[82, Section 2.2] for the relation between the distributions of $\xi_{(j)}$ and $\xi_{(1)}$ in
the i.i.d. case.)
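The absence of a limit distribution in this example can be seen from a two-line computation. The sketch below evaluates the Gumbel approximation $e^{-n2^{-k-1}}$ of $\mathbb P(Y_{(1)} \le k)$ at $k = \lfloor \log_2 n \rfloor$ along two subsequences of $n$, and compares it with the exact distribution function of the maximum of $n$ i.i.d. $\operatorname{Ge}(1/2)$ variables; the function names are hypothetical.

```python
import math

def gumbel_floor_cdf(n, k):
    """P(floor((W + log n)/log 2) <= k) = exp(-n * 2**-(k+1)) for Gumbel W."""
    return math.exp(-n * 2.0 ** -(k + 1))

def iid_max_cdf(n, k):
    """Exact P(max of n i.i.d. Ge(1/2) variables <= k)."""
    return (1.0 - 2.0 ** -(k + 1)) ** n

def value_at_floor_log2(n):
    k = n.bit_length() - 1        # floor(log2 n) for a positive integer n
    return gumbel_floor_cdf(n, k)

along_powers_of_two = value_at_floor_log2(2 ** 10)      # {log2 n} = 0
along_three_halves = value_at_floor_log2(3 * 2 ** 9)    # {log2 n} = log2(3/2)
```

Along $n = 2^r$ the value is $e^{-1/2}$ for every $r$, while along $n = 3 \cdot 2^{r-1}$ it is $e^{-3/4}$, so the distribution of $Y_{(1)} - \lfloor \log_2 n \rfloor$ oscillates with the fractional part of $\log_2 n$.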

Example 18.18. For uniform random labelled unordered rooted trees, we
have by Example 9.2 $\xi \sim \operatorname{Po}(1)$ with $\pi_k = e^{-1}/k!$. We have $w_{k+1}/w_k \to 0$,
so Theorem 18.16(ii) applies and shows that $Y_{(1)}$ is concentrated on at most
two values, as proved by Kolchin [76, Theorem 2.5.2]; see also Meir and
Moon [sil] and Carr, Goh and Schmutz 21|. 

Explicitly, (18.51) yields (treating the rather trivial case $n > k^{1/2} \cdot k!$
separately)

$$\mathbb P(Y_{(1)} < k) = e^{-n e^{-1}/k!\,(1 + O(1/k))} + o(1) = e^{-n e^{-1}/k!} + o(1) \tag{18.67}$$

which by Stirling's formula yields

$$\mathbb P(Y_{(1)} < k) = \exp\Bigl(-e^{\log n - (k + \frac12)\log k + k - \log(e\sqrt{2\pi})}\Bigr) + o(1) \tag{18.68}$$

uniformly in $k \ge 1$, cf. Carr, Goh and Schmutz [21]. It follows easily from
Stirling's formula, or from (18.68), that $k_1(n), k_2(n), k_3(n) \sim \log n/\log\log n$,
and more precise asymptotics can be found too; cf. [90], [87], [21].

In fact, the simple Example 18.17 is typical for the case $w_{k+1}/w_k \to a > 0$
as $k \to \infty$; then $Y_{(1)}$ always has asymptotically the same distribution as
the maximum of i.i.d. geometric random variables, provided we adjust the
number of these variables according to $\mathbf w$. We state some versions of this in
the next theorem. For simplicity we consider only the maximum $Y_{(1)}$, and
leave the extensions to $Y_{(j)}$ for general fixed $j$ to the reader.

Theorem 18.19. Suppose that $w_0 > 0$ and that $w_{k+1}/w_k \to a$ as $k \to \infty$,
with $0 < a < \infty$. Suppose further that $n \to \infty$ and $m = m(n)$ with $m =
\lambda n + o(\sqrt n)$ where $0 < \lambda < \nu$.

Let $q := \tau a = \tau/\rho < 1$. Let $k(n)$ be any sequence such that $\pi_{k(n)} =
\Theta(1/n)$; equivalently, $k(n) = k_1(n) + O(1)$, and let $N = N(n)$ be integers
such that

$$N \sim \frac{n\,\pi_{k(n)}\, q^{-k(n)}}{1 - q} = \frac{n\, w_{k(n)}\, \rho^{k(n)}}{\Phi(\tau)(1 - q)}. \tag{18.69}$$

(i) Let $\eta_1, \dots, \eta_N$ be i.i.d. random variables with a geometric distribution
$\operatorname{Ge}(1 - q)$, i.e., $\mathbb P(\eta_i = k) = (1 - q) q^k$, $k \ge 0$. Then

$$Y_{(1)} \overset{d}{\approx} \max_{i \le N} \eta_i. \tag{18.70}$$

(ii) Let $W$ have the Gumbel distribution (18.63). Then

$$Y_{(1)} \overset{d}{\approx} \bigl\lfloor W/\log(1/q) + \log_{1/q} N \bigr\rfloor. \tag{18.71}$$

(iii) Let $b_n := n\pi_{k(n)}$; thus $b_n = \Theta(1)$. Then

$$Y_{(1)} - k(n) \overset{d}{\approx} \bigl\lfloor \bigl(W + \log(b_n/(1 - q))\bigr)/\log(1/q) \bigr\rfloor. \tag{18.72}$$

Thus $Y_{(1)} - k(n)$ is tight, and converges for every subsequence such that
$b_n$ converges.

Hence — k{n) converges for every subsequence such that bn converges, 
but the limit depends on the subsequence so Y(i) — k{n) does not have a 
lim it distrib ution. (For the distributions that appear as subsequence limits, 



see 



JansonI [60|, Examples 4.3 and 2.7].) Note that necessarily k{n) — >■ oo 



We show first a simple lemma, similar to Lemma 18.5.

Lemma 18.20. Let $X_n$ and $X_n'$ be integer-valued random variables and
suppose that there exists a sequence of integers $k(n)$ such that $X_n - k(n)$ is
tight. (Equivalently: $X_n = k(n) + O_p(1)$.) Then the following are equivalent:

(i) $\mathbb P(X_n \le k(n)+\ell) - \mathbb P(X_n' \le k(n)+\ell) \to 0$ for each fixed $\ell \in \mathbb Z$;

(ii) $d_K(X_n, X_n') \to 0$;

(iii) $X_n \overset{d}{\approx} X_n'$, i.e., $d_{TV}(X_n, X_n') \to 0$.

Proof. By considering $X_n - k(n)$ and $X_n' - k(n)$ we may assume that $k(n) =
0$. Let $\varepsilon > 0$. Since $X_n$ is tight, there exists $L$ such that $\mathbb P(|X_n| > L) < \varepsilon$
for every $n$. Suppose that (i) holds. Then

$$d_{TV}(X_n, X_n') = \sum_{\ell=-\infty}^{\infty} \bigl(\mathbb P(X_n = \ell) - \mathbb P(X_n' = \ell)\bigr)_+
\le \sum_{\ell=-L}^{L} \bigl(\mathbb P(X_n = \ell) - \mathbb P(X_n' = \ell)\bigr)_+ + \mathbb P(|X_n| > L) \le o(1) + \varepsilon.$$

This shows (iii). The implications (iii) $\Rightarrow$ (ii) and (ii) $\Rightarrow$ (i) are trivial.

□
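The two distances in Lemma 18.20 can be computed directly for integer-valued distributions given as finite tables. The sketch below uses the formula $d_{TV} = \sum_\ell (p_\ell - p'_\ell)_+$ from the proof and takes $d_K$ to be the largest absolute difference between the two distribution functions (the usual Kolmogorov distance, assumed here); the two probability tables are made-up illustrations.

```python
def d_tv(p: dict, q: dict) -> float:
    """Total variation distance: sum of the positive parts of p - q."""
    support = set(p) | set(q)
    return sum(max(p.get(x, 0.0) - q.get(x, 0.0), 0.0) for x in support)

def d_k(p: dict, q: dict) -> float:
    """Kolmogorov distance: largest absolute difference of the two CDFs."""
    cdf_p = cdf_q = 0.0
    worst = 0.0
    for x in sorted(set(p) | set(q)):
        cdf_p += p.get(x, 0.0)
        cdf_q += q.get(x, 0.0)
        worst = max(worst, abs(cdf_p - cdf_q))
    return worst

# Hypothetical example distributions on the integers {0, 1, 2}.
p = {0: 0.5, 1: 0.5}
q = {0: 0.25, 1: 0.5, 2: 0.25}
print(d_tv(p, q), d_k(p, q))
```

In general $d_K \le d_{TV}$, which is the trivial implication (iii) $\Rightarrow$ (ii); the lemma says that under tightness the converse direction also holds asymptotically.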

Proof of Theorem 18.19. By (18.58), $\pi_{k+1}/\pi_k \to q$ as $k \to \infty$, and it follows
from (18.55) that $n\pi_{k_1(n)} = \Theta(1)$. It follows further that $\pi_{k(n)} =
\Theta(1/n) \iff k(n) = k_1(n) + O(1)$, as asserted, and then $\pi_{k(n)}q^{-k(n)} \sim
\pi_{k_1(n)}q^{-k_1(n)}$; thus we may replace $k(n)$ by $k_1(n)$ in (18.69).
(i): For each fixed $\ell \in \mathbb Z$, by (18.59), (18.58) and (18.69),

$$n\mathbb P(\xi \ge k(n)+\ell) = n\Pi_{k(n)+\ell} \sim \frac{n\pi_{k(n)+\ell}}{1-q} \sim \frac{n\pi_{k(n)}q^{\ell}}{1-q} \sim Nq^{k(n)+\ell} = N\,\mathbb P(\eta_1 \ge k(n)+\ell);$$

furthermore, this is $\Theta(1)$. Hence, (18.51) yields

$$\mathbb P(Y_{(1)} < k(n)+\ell) = e^{-n\mathbb P(\xi \ge k(n)+\ell)} + o(1) = e^{-N\mathbb P(\eta_1 \ge k(n)+\ell)} + o(1)$$
$$= \bigl(1 - \mathbb P(\eta_1 \ge k(n)+\ell)\bigr)^N + o(1) = \mathbb P\bigl(\max_{i\le N}\eta_i < k(n)+\ell\bigr) + o(1),$$

and (18.70) follows by Lemma 18.20, since $Y_{(1)} - k_1(n)$ is tight by Theorem 18.16.

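(ii): As in (18.64), uniformly in $k \in \mathbb Z$,

$$\mathbb P\bigl(\max_{i\le N}\eta_i < k\bigr) = (1-q^k)^N = e^{-Nq^k} + o(1) = \mathbb P\bigl(W < k\log(1/q) - \log N\bigr) + o(1)$$
$$= \mathbb P\Bigl(\frac{W + \log N}{\log(1/q)} < k\Bigr) + o(1) = \mathbb P\bigl(\lfloor W/\log(1/q) + \log_{1/q}N \rfloor < k\bigr) + o(1).$$

Hence, $d_{TV}\bigl(\max_{i\le N}\eta_i,\ \lfloor W/\log(1/q) + \log_{1/q}N\rfloor\bigr) \to 0$, and (18.71) follows
from (18.70) and Lemma 18.20.

(iii): By (18.69), $\log_{1/q}N = k(n) + \log_{1/q}(b_n/(1-q)) + o(1)$, and (18.72)
follows easily from (18.71), using Lemma 18.20 and the fact that $W$ is
absolutely continuous. □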
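Step (ii) of the proof rests on the uniform approximation $(1-q^k)^N = e^{-Nq^k} + o(1)$. A minimal numerical sketch of this, with illustrative (assumed) values of $N$ and $q$, compares the two distribution functions over a range of $k$ and reports the largest gap.

```python
import math

def cdf_max_geometric(k: int, N: int, q: float) -> float:
    """P(max of N i.i.d. Ge(1-q) variables < k) = (1 - q^k)^N, for k >= 0."""
    return (1.0 - q**k) ** N

def cdf_gumbel_side(k: int, N: int, q: float) -> float:
    """exp(-N q^k), i.e. P(W < k log(1/q) - log N) for a Gumbel variable W."""
    return math.exp(-N * q**k)

def sup_gap(N: int, q: float, k_max: int = 200) -> float:
    """Largest difference between the two CDFs over k = 0, ..., k_max."""
    return max(abs(cdf_max_geometric(k, N, q) - cdf_gumbel_side(k, N, q))
               for k in range(k_max + 1))

# Illustrative parameters (assumptions): q = 1/2 and two sizes N.
print(sup_gap(10**3, 0.5), sup_gap(10**6, 0.5))
```

The gap shrinks roughly like $1/N$, so the maximum of $N$ geometric variables and the rounded, shifted Gumbel variable in (18.71) are close in Kolmogorov distance once $N$ is large.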

Remark 18.21. For later use we note that Theorem 18.19, as other results,
extends to the case $w_0 = 0$ by the argument in Remark 10.8; we now have
to assume $\lambda > \alpha := \min\{k : w_k > 0\}$. The extension of Theorem 18.19(i) is
perhaps more subtle than other applications of this argument since $N$ will
change by a factor $\sim q^{\alpha}$, but (ii) and (iii) are straightforward, and then (i)
follows by (18.74) and Lemma 18.20.

If the weight sequence is very irregular, $Y_{(1)}$ can fail to be concentrated
even in the case $\lambda < \nu$.

Example 18.22. Let £j := 2'^' and S := Let Wk = l/k^ ii k e H, 

= if ^ 3 and fc ^ S, and choose wq > 0, wi > and W2 such that 
(wk) is a probability weight sequence with /i := X^fclo ^^fc = 1- Then p = 1 
and, by (I3.11|) . = oo. Choose m = n — 1 (the tree case); thus X = 1 < u. 
Note that Ij+i = e.. Un = ij, then P(^ ^ ij) ~ = n~'^, P(C ^ 

ij^i) ~ = n~\ and P(C ^ £j-2) ~ l/£j-2 = and it follows 

from (|18.5ip that for n in the subsequence S, P(y(i) < ij) 1, '^(^{1) < 
ij-i) — 5- e^^ and P(i^(i) < ^j-2) — ^ 0. Hence, along this subsequence, 
= ni/2) ^ 1 _ and P(y(i) = n^^) ^ 



18.4. The subcase $w_{k+1}/w_k \to 0$ as $k \to \infty$. We have seen in Theorem 18.16
that when $w_{k+1}/w_k \to 0$ as $k \to \infty$, the maximum $Y_{(1)}$ is asymptotically
concentrated at one or two values. We shall see that for "most"
(in a sense specified below) values of $n$, $Y_{(1)}$ is concentrated at one value,
but there are also rather large transition regions where $Y_{(1)}$ takes two values
with rather large probabilities.

We have, as said before Theorem 18.16, $\omega = \infty$ and $\rho = \infty$. Furthermore,
by Lemma 3.1(v), $\nu = \infty$.

We define

$$n_k := \lfloor 1/\pi_k \rfloor, \tag{18.75}$$

noting that $n_{k+1}/n_k \sim \pi_k/\pi_{k+1} \to \infty$ as $k \to \infty$; in particular, $n_{k+1} > n_k$
(for large $k$, at least). The results above then can be stated as follows.

Theorem 18.23. Suppose that $w_0 > 0$ and that $w_{k+1}/w_k \to 0$ as $k \to \infty$.
Suppose further that $n \to \infty$ and $m = m(n)$ with $m = \lambda n + o(\sqrt n)$ where
$0 < \lambda < \infty$.

(i) Consider $n$ in a subsequence such that for some $k(n)$ and some $x \in
(0,\infty)$, $n/n_{k(n)} \to x$. Then

$$\mathbb P(Y_{(1)} = k(n)-1) \to e^{-x},$$
$$\mathbb P(Y_{(1)} = k(n)) \to 1 - e^{-x}.$$

(ii) Let $\Omega_k \to \infty$ as $k \to \infty$. If $n \to \infty$ with $n \notin \bigcup_{k=1}^{\infty}[\Omega_k^{-1}n_k, \Omega_k n_k]$,
then, for $k(n)$ such that $n_{k(n)} < n < n_{k(n)+1}$,

$$\mathbb P(Y_{(1)} = k(n)) \to 1.$$

Proof. (i): Along the subsequence, using (18.54), (18.59) and (18.75),

$$n\mathbb P(\xi \ge k(n)) = n\Pi_{k(n)} \sim n\pi_{k(n)} \sim \frac{n}{n_{k(n)}} \to x. \tag{18.76}$$

Hence, (18.51) yields $\mathbb P(Y_{(1)} \le k(n)-1) \to e^{-x}$. Furthermore, by (18.76)
and (18.61), $n\mathbb P(\xi > k(n)) \to 0$ and $n\mathbb P(\xi \ge k(n)-1) \to \infty$; hence (18.51)
yields $\mathbb P(Y_{(1)} \le k(n)) \to 1$ and $\mathbb P(Y_{(1)} \le k(n)-2) \to 0$.

(ii): We may assume $\Omega_k > 1$. Then the assumptions imply $\Omega_{k(n)}n_{k(n)} <
n < \Omega_{k(n)+1}^{-1}n_{k(n)+1}$, where $k(n) \to \infty$ and thus $\Omega_{k(n)} \to \infty$ as $n \to \infty$.
Hence, similarly to (18.76),

$$n\mathbb P(\xi \ge k(n)) \sim \frac{n}{n_{k(n)}} > \Omega_{k(n)} \to \infty,$$
$$n\mathbb P(\xi \ge k(n)+1) \sim \frac{n}{n_{k(n)+1}} < \Omega_{k(n)+1}^{-1} \to 0,$$

and the result follows by (18.51). □

Roughly speaking, the values of $n$ such that $Y_{(1)}$ takes two values with
significant probabilities thus form intervals around each $n_k$, of the same
length on a logarithmic scale; between these intervals, $Y_{(1)}$ is concentrated
at one value.

Example 18.24. Consider again uniform random labelled unordered rooted
trees, as in Example 18.18. We have $n_k = \lfloor e\,k! \rfloor$. In this case, it is simpler
to redefine $n_k := k!$; Theorem 18.23(ii) is unaffected but (i) is modified to

$$\mathbb P(Y_{(1)} = k(n)-1) \to e^{-x/e}, \tag{18.77}$$

$$\mathbb P(Y_{(1)} = k(n)) \to 1 - e^{-x/e}. \tag{18.78}$$

Cf. Carr, Goh and Schmutz [21].
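The two-point behaviour in (18.77)-(18.78) can be illustrated numerically. The sketch below is not a simulation of random trees: it simply evaluates the leading term $\exp(-n\,\mathbb P(\xi \ge k))$ of $\mathbb P(Y_{(1)} < k)$ for $\xi \sim \mathrm{Po}(1)$, dropping the $o(1)$ error, and takes $n = x\,k!$. The function names and the choice $k = 30$, $x = 1$ are assumptions made for illustration.

```python
import math

def poisson1_tail(k: int, terms: int = 80) -> float:
    """P(xi >= k) for xi ~ Po(1), summing the first `terms` terms of the tail."""
    return math.exp(-1.0) * sum(1.0 / math.factorial(j) for j in range(k, k + terms))

def approx_cdf_max(n: float, k: int) -> float:
    """Leading term exp(-n P(xi >= k)) of P(Y_(1) < k); the o(1) error is dropped."""
    return math.exp(-n * poisson1_tail(k))

def two_point_probs(k: int, x: float):
    """Approximate P(Y_(1) = k-1) and P(Y_(1) = k) when n = x * k!."""
    n = x * math.factorial(k)
    p_km1 = approx_cdf_max(n, k) - approx_cdf_max(n, k - 1)
    p_k = approx_cdf_max(n, k + 1) - approx_cdf_max(n, k)
    return p_km1, p_k

p_km1, p_k = two_point_probs(30, 1.0)
print(p_km1, p_k, math.exp(-1.0 / math.e))
```

The two probabilities nearly sum to 1 and are close to $e^{-x/e}$ and $1 - e^{-x/e}$; the remaining discrepancy is of order $1/k$, which reflects how slowly $k(n) \sim \log n/\log\log n$ grows.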



Remark 18.25. We have for simplicity considered only the maximum value
$Y_{(1)}$ in Theorem 18.23. It is easily seen, by minor modifications in the
proof, that for any fixed $j$, in (ii) also $Y_{(j)} = k(n)$ w.h.p., while in (i)
$Y_{(j)} \in \{k(n)-1, k(n)\}$ w.h.p., but the two probabilities have limits depending
on $j$; in fact, the number of $j$ such that $Y_{(j)} = k(n)$ converges in distribution
to Po$(x)$. We omit the details.

To make the statement about "most" $n$ precise, recall that the upper
and lower densities of a set $A \subseteq \mathbb N$ are defined as $\limsup_{n\to\infty} a(n)/n$ and
$\liminf_{n\to\infty} a(n)/n$, where $a(n) := |\{i \le n : i \in A\}|$; if they coincide, i.e.,
if the limit $\lim_{n\to\infty} a(n)/n$ exists, it is called the density. Similarly, the
logarithmic density of $A$ is $\lim_{n\to\infty} \frac{1}{\log n}\sum_{i \le n,\ i \in A}\frac1i$, when this limit exists,
with upper and lower logarithmic densities defined using lim sup and
lim inf. It is easily seen that if a set has a density, then it also has a logarithmic
density, and the two densities coincide. (The converse does not hold.)
Furthermore, define

$$p_n^* := \max_k \mathbb P(Y_{(1)} = k).$$

It follows from Theorem 18.16 that the second largest probability $\mathbb P(Y_{(1)} =
k)$ is $1 - p_n^* + o(1)$. Thus, for $n$ in a subsequence, $Y_{(1)}$ is asymptotically
concentrated at one value if and only if $p_n^* \to 1$; if $p_n^*$ stays away from 1,
$Y_{(1)}$ takes two values with large probabilities.

Theorem 18.26. Suppose that $w_0 > 0$ and that $w_{k+1}/w_k \to 0$ as $k \to \infty$.
Suppose further that $n \to \infty$ and $m = m(n)$ with $m = \lambda n + o(\sqrt n)$ where
$0 < \lambda < \infty$.

(i) If $\frac12 < a < 1$, then the set $\{n : p_n^* < a\}$ has upper density
$\log\frac{a}{1-a}\big/\log\frac{1}{1-a} > 0$ and lower density 0.

(ii) There exists a subsequence of $n$ with upper density 1 and logarithmic
density 1 such that $p_n^* \to 1$.

Note that the upper density in (i) can be made arbitrarily close to 1 by
taking $a$ close to 1. This was observed by Carr, Goh and Schmutz [21] for
the case in Example 18.24. (However, they failed to remark that the lower
density nevertheless is 0.)

Proof. (i): Let $b_1 := -\log a$ and $b_2 := -\log(1-a)$; thus $0 < b_1 < b_2 <
\infty$. Then $\max(e^{-x}, 1-e^{-x}) < a \iff x \in (b_1, b_2)$, and it follows from
Theorem 18.23 (and a uniformity in $x$ implicit in the proof) that for any
$\varepsilon > 0$, if $n \in \bigcup_k [(b_1+\varepsilon)n_k, (b_2-\varepsilon)n_k]$, then $p_n^* < a$ for large $n$, while if
$n \notin \bigcup_k [(b_1-\varepsilon)n_k, (b_2+\varepsilon)n_k]$, then $p_n^* > a$ for large $n$. Since $n_{k+1}/n_k \to \infty$
as $k \to \infty$, it is easily seen that for any $b_1', b_2'$ with $0 < b_1' < b_2' < \infty$,
$\bigcup_k [b_1'n_k, b_2'n_k]$ has upper density $(b_2'-b_1')/b_2'$ and lower density 0; it follows
by taking $b_j' := b_j \pm \varepsilon$ and letting $\varepsilon \to 0$ that the set $\{n : p_n^* < a\}$ has upper
density $(b_2-b_1)/b_2$ and lower density 0.

(ii): Let $\Omega_k$ be an increasing sequence with $\Omega_k \to \infty$ so slowly that
$\log\Omega_k = o(\log(n_k/n_{k-1}))$. Let $A := \bigcup_k [\Omega_k^{-1}n_k, \Omega_k n_k]$. By Theorem 18.23(ii),
$p_n^* \to 1$ as $n \to \infty$ with $n \notin A$, so it suffices to prove that $A$ has lower density 0
and logarithmic density 0.

It is easily seen that for the upper logarithmic density of $A$, it suffices to
consider $n \in \{\lfloor \Omega_k n_k \rfloor\}$, which gives

$$\limsup_{k\to\infty} \frac{\sum_{j=1}^{k}\sum_{\Omega_j^{-1}n_j \le i \le \Omega_j n_j}\frac1i}{\log(\Omega_k n_k)}
\le \limsup_{k\to\infty} \frac{\sum_{j=1}^{k}\bigl(2\log\Omega_j + O(1)\bigr)}{\sum_{j=1}^{k}\log(n_j/n_{j-1})} = 0.$$

Hence the logarithmic density exists and is 0.

The lower density is at most, considering the subsequence $\lfloor \Omega_k^{-1}n_k \rfloor$,

$$\liminf_{k\to\infty} \frac{a(\Omega_k^{-1}n_k)}{\Omega_k^{-1}n_k} \le \liminf_{k\to\infty} \frac{\Omega_{k-1}n_{k-1}}{\Omega_k^{-1}n_k} = \lim_{k\to\infty} \frac{\Omega_{k-1}\Omega_k n_{k-1}}{n_k} = 0,$$

since $\Omega_{k-1} \le \Omega_k < (n_k/n_{k-1})^{1/3}$ for large $k$. (Alternatively, it is a general
fact that the lower density is at most the (lower) logarithmic density, for
any set $A \subseteq \mathbb N$.) □
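The density claim used in part (i), that $\bigcup_k [b_1'n_k, b_2'n_k]$ has upper density $(b_2'-b_1')/b_2'$ and lower density 0 when $n_{k+1}/n_k \to \infty$, can be checked with exact integer arithmetic. The sketch below assumes $n_k = k!$ (as in Example 18.24) and the illustrative rational values $b_1' = 1/4$, $b_2' = 2$; the helper names are hypothetical.

```python
import math
from fractions import Fraction

def count_in_union(N: int, b1: Fraction, b2: Fraction, K: int) -> int:
    """Number of integers i <= N in the union of [b1*k!, b2*k!] for k = 1..K."""
    count = 0
    last = 0  # largest integer counted so far, so overlapping intervals are not double counted
    for k in range(1, K + 1):
        nk = math.factorial(k)
        lo = max(math.ceil(b1 * nk), last + 1)
        hi = min(math.floor(b2 * nk), N)
        if hi >= lo:
            count += hi - lo + 1
            last = hi
    return count

b1, b2, K = Fraction(1, 4), Fraction(2), 200
top = math.floor(b2 * math.factorial(K))          # right end of the last interval
gap = math.ceil(b1 * math.factorial(K)) - 1       # just before the last interval starts
upper = count_in_union(top, b1, b2, K) / top
lower = count_in_union(gap, b1, b2, K) / gap
print(upper, lower, float((b2 - b1) / b2))
```

Evaluating the counting ratio at the right end of an interval gives a value near $(b_2'-b_1')/b_2' = 0.875$, while evaluating it just before the next interval begins gives a value near 0, matching the upper and lower densities in the proof.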

18.5. The subcase $\lambda = \nu$ and $\sigma^2 = \infty$. We give two examples of the case
$\lambda = \nu$ and $\sigma^2 = \infty$. (In both examples, we may assume that $\nu = 1$ and
$m = n-1$, so the examples apply to simply generated random trees.) The
first example shows that Theorem 18.7 does not always hold if $\sigma^2 = \infty$; the
second shows that it sometimes does.

Example 18.27. Let $1 < \alpha < 2$ and let $(w_k)$ be a probability weight
sequence with $w_0 > 0$ and $w_k \sim ck^{-\alpha-1}$ as $k \to \infty$, for some $c > 0$. (This
is as in Example 11.10 with $\beta = \alpha+1 \in (2,3)$. If $(w_k)$ is not a probability
weight sequence, we may replace $c$ by $c' := c/\Phi(1)$.) We have $\rho = 1$, and
thus $\nu = \Phi'(1) = \sum_k kw_k < \infty$. (We may obtain any desired $\nu > 0$, for
example $\nu = 1$, by adjusting the first few $w_k$.)

We consider the case $m = \nu n + O(1)$; thus $m/n \to \lambda = \nu$. (This includes
the tree case $m = n-1$ in the case $\nu = 1$. Actually, it suffices to assume
$m = \nu n + o(n^{1/\alpha})$.) Then $\tau = 1 = \rho$, and $\pi_k = w_k$.

The random variable $\xi$ thus satisfies $\mathbb E\xi = \lambda = \nu$. Note that $\sigma^2 := \mathrm{Var}\,\xi =
\infty$. (This is the main reason for taking $1 < \alpha < 2$; if we take $\alpha > 2$, then
$\sigma^2 < \infty$ and Theorem 18.7 applies.) Furthermore,



$$\mathbb P(\xi \ge k) = \sum_{l=k}^{\infty} w_l \sim c\alpha^{-1}k^{-\alpha}. \tag{18.79}$$

As in the proof of Theorem 17.14 there exists by [39, Section XVII.5] a
stable random variable $X_\alpha$ (satisfying (18.93) and (18.113)) such that

$$\frac{S_n - n\nu}{n^{1/\alpha}} \overset{d}{\to} X_\alpha; \tag{18.80}$$

moreover, by [4f|, § 50], the local limit law (17.22) holds uniformly for all
integers $\ell$. Note that the density function $g$ is bounded and uniformly continuous
on $\mathbb R$, and that $g(0) > 0$ by (17.24). (In fact, $g(x) > 0$ for all $x$. See
also [39, Section XVII.6] for an explicit formula for $g$ as a power series; $X_\alpha$
is, after rescaling, the extreme case $\gamma = 2-\alpha$, in the notation there.)
By (18.14) and (17.22),

$$\mathbb P(Y_1 = k) = w_k\frac{\mathbb P(S_{n-1} = m-k)}{\mathbb P(S_n = m)} = w_k\frac{g(-k/n^{1/\alpha}) + o(1)}{g(0) + o(1)} = w_k\frac{g(-k/n^{1/\alpha}) + o(1)}{g(0)}, \tag{18.81}$$

uniformly in $k \ge 0$.

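For a non-negative function $f$ on $[0,\infty)$, define

$$X_n^f := \sum_{i=1}^{n} f(Y_i/n^{1/\alpha}). \tag{18.82}$$

In particular, if $f$ is the indicator $\mathbf 1\{a \le x \le b\}$ of an interval $[a,b]$, we write
$X_n^{a,b}$ and have in the notation of (18.7)

$$X_n^{a,b} := \bigl|\{i \le n : an^{1/\alpha} \le Y_i \le bn^{1/\alpha}\}\bigr| = N_{[an^{1/\alpha},\,bn^{1/\alpha}]}. \tag{18.83}$$

Suppose that $f$ is either the indicator of a compact interval $[a,b] \subset (0,\infty)$,
or a continuous function with compact support in $(0,\infty)$ (or, more generally,
any Riemann integrable function with support in a compact interval in
$(0,\infty)$). Then, using (18.81) and dominated convergence,

$$\mathbb E X_n^f = n\sum_{k=0}^{\infty} f(k/n^{1/\alpha})\,\mathbb P(Y_1 = k) = n\sum_{k=0}^{\infty} f(k/n^{1/\alpha})\,w_k\frac{g(-k/n^{1/\alpha}) + o(1)}{g(0)}
\to \int_0^{\infty} f(x)\,cx^{-\alpha-1}\frac{g(-x)}{g(0)}\,dx. \tag{18.84}$$

In the special case when $f(x) = \mathbf 1\{a \le x \le b\}$ with $0 < a < b < \infty$, we
further similarly obtain,

$$\mathbb E X_n^{a,b}(X_n^{a,b}-1) = n(n-1)\sum_{k,j} f(k/n^{1/\alpha})f(j/n^{1/\alpha})\,\mathbb P(Y_1 = k, Y_2 = j)$$
$$= n(n-1)\sum_{k,j} f(k/n^{1/\alpha})f(j/n^{1/\alpha})\,w_kw_j\frac{\mathbb P(S_{n-2} = m-k-j)}{\mathbb P(S_n = m)}$$
$$\to c^2\int_0^{\infty}\!\!\int_0^{\infty} f(x)f(y)\,x^{-\alpha-1}y^{-\alpha-1}\frac{g(-x-y)}{g(0)}\,dx\,dy
= c^2\int_a^b\!\!\int_a^b x^{-\alpha-1}y^{-\alpha-1}\frac{g(-x-y)}{g(0)}\,dx\,dy$$

and, more generally, for any $\ell \ge 1$,

$$\mathbb E (X_n^{a,b})_\ell \to c^{\ell}\int_a^b\!\cdots\!\int_a^b \prod_{i=1}^{\ell} x_i^{-\alpha-1}\,\frac{g(-x_1-\cdots-x_\ell)}{g(0)}\,dx_1\cdots dx_\ell. \tag{18.85}$$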

For each such interval $[a,b]$, this integral is bounded by $CR^{\ell}$ for all $\ell \ge 1$,
for some $C$ and $R$ (depending on $a$ and $b$), and it follows by the method
of moments that $X_n^{a,b} \overset{d}{\to} X_\infty^{a,b}$, where $X_\infty^{a,b}$ is determined by its factorial
moments

$$\mathbb E (X_\infty^{a,b})_\ell = c^{\ell}\int_a^b\!\cdots\!\int_a^b \prod_{i=1}^{\ell} x_i^{-\alpha-1}\,\frac{g(-x_1-\cdots-x_\ell)}{g(0)}\,dx_1\cdots dx_\ell. \tag{18.86}$$

(It follows that $X_\infty^{a,b}$ has a finite moment generating function, so the method
of moments applies.) Furthermore, joint convergence for several intervals
holds by the same argument. It follows also (by some modifications or by
approximation with step functions; we omit the details) that $X_n^f \overset{d}{\to} X_\infty^f$
for every continuous $f \ge 0$ with compact support and some $X_\infty^f$.

Let $\Xi_n$ be the multiset $\{Y_i/n^{1/\alpha} : Y_i > 0\}$, regarded as a point process
on $(0,\infty)$. (I.e., formally we let $\Xi_n$ be the discrete measure $\sum_{i \le n:\,Y_i > 0}\delta_{Y_i/n^{1/\alpha}}$.
See e.g. Kallenberg [68] or [69] for details on point processes, or Janson [57,
§ 4] for a brief summary.) The convergence $X_n^f \overset{d}{\to} X_\infty^f$ for every continuous
$f \ge 0$ with compact support in $(0,\infty)$ implies, see [63, Lemma 5.1] or [68,
Lemma 16.15 and Theorem 16.16], that $\Xi_n$ converges in distribution, as a
point process on $(0,\infty)$, to some point process $\Xi$ on $(0,\infty)$. The distribution
of $\Xi$ is determined by (18.86), where $X_\infty^{a,b}$ is the number of points of $\Xi$ in
$[a,b]$. By (18.86) or (18.84), the intensity measure is given by

$$\mathbb E\,\Xi = c\,g(0)^{-1}x^{-\alpha-1}g(-x)\,dx. \tag{18.87}$$

We can also consider infinite intervals. Let $a > 0$. Then, using again
(18.81) and noting that $\sum_{k=-\infty}^{\infty}\mathbb P(S_{n-1} = m-k) = 1$,

$$\mathbb E X_n^{a,\infty} \le nC_1(an^{1/\alpha})^{-\alpha-1}\,\frac{\sum_{k \ge an^{1/\alpha}}\mathbb P(S_{n-1} = m-k)}{\mathbb P(S_n = m)}
\le nC_1(an^{1/\alpha})^{-\alpha-1}\,\frac{1}{\mathbb P(S_n = m)} \le C_2a^{-\alpha-1}. \tag{18.88}$$

By Fatou's lemma, (18.88) implies $\mathbb E X_\infty^{a,\infty} \le C_2a^{-\alpha-1} < \infty$. Hence, $X_\infty^{a,\infty} <
\infty$ a.s. for every $a > 0$, and we may order the points in $\Xi$ in decreasing order
as

$$\Xi = \{\eta_j\}_{j=1}^{J} \quad\text{with } \eta_1 \ge \eta_2 \ge \cdots. \tag{18.89}$$

(Here $J = X_\infty^{0,\infty} \le \infty$ is the random number of points in $\Xi$. We shall see
that $J = \infty$ a.s.)

The bound (18.88) is uniform in $n$, and tends to 0 as $a \to \infty$. It follows,
see [57, Lemma 4.1], that if we regard $\Xi_n$ and $\Xi$ as point processes on $(0,\infty]$,
the convergence $\Xi_n \overset{d}{\to} \Xi$ on $(0,\infty)$ implies the stronger result

$$\Xi_n \overset{d}{\to} \Xi \quad\text{on } (0,\infty]. \tag{18.90}$$

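The points in $\Xi_n$, ordered in decreasing order, are $Y_{(1)}/n^{1/\alpha} \ge Y_{(2)}/n^{1/\alpha} \ge
\cdots$. If we extend (18.89) by defining $\eta_j := 0$ when $j > J$, the convergence
(18.90) of point processes on $(0,\infty]$ is by [57, Lemma 4.4] equivalent to joint
convergence of the ranked points, i.e.

$$Y_{(j)}/n^{1/\alpha} \overset{d}{\to} \eta_j, \qquad j \ge 1 \text{ (jointly)}. \tag{18.91}$$

We claim that each $\eta_j > 0$ a.s., and thus $J = X_\infty^{0,\infty} = \infty$ a.s. Suppose
the opposite: $\mathbb P(\eta_j = 0) = \delta > 0$ for some $j$. Then, for every $\varepsilon > 0$,
$\liminf_{n\to\infty}\mathbb P(Y_{(j)}/n^{1/\alpha} < \varepsilon) \ge \mathbb P(\eta_j < \varepsilon) \ge \delta$, and it follows that there exists
a sequence $\varepsilon_n \to 0$ such that $\mathbb P(Y_{(j)}/n^{1/\alpha} < \varepsilon_n) \ge \delta/2$ for all $n$. We may
assume that $\varepsilon_nn^{1/\alpha} \to \infty$. Let $A > 0$ and take (for large $n$) $a_n := \varepsilon_n$ and
$b_n := (\varepsilon_n^{-\alpha} - \alpha c^{-1}A)^{-1/\alpha}$. Then $a_n, b_n \to 0$. For $k \le b_nn^{1/\alpha} = o(n^{1/\alpha})$,
(17.22) implies $\mathbb P(S_{n-1} = m-k)/\mathbb P(S_n = m) \to 1$, and the argument in
(18.81)-(18.85) yields, for each $\ell \ge 1$,

$$\mathbb E (X_n^{a_n,b_n})_\ell \sim \Bigl(n\sum_{a_nn^{1/\alpha} \le k \le b_nn^{1/\alpha}} w_k\Bigr)^{\ell} \sim \Bigl(c\int_{a_n}^{b_n}x^{-\alpha-1}\,dx\Bigr)^{\ell}
= \bigl(c\alpha^{-1}(a_n^{-\alpha} - b_n^{-\alpha})\bigr)^{\ell} = A^{\ell}. \tag{18.92}$$

Hence, $X_n^{a_n,b_n} \overset{d}{\to} \mathrm{Po}(A)$; in particular,

$$\delta/2 \le \mathbb P(Y_{(j)}/n^{1/\alpha} < \varepsilon_n) \le \mathbb P(X_n^{a_n,b_n} < j) \to \mathbb P(\mathrm{Po}(A) < j).$$

Taking $A$ large enough, we can make $\mathbb P(\mathrm{Po}(A) < j) < \delta/2$, a contradiction
which proves our claim.

We have shown that (18.91) holds with $\eta_j > 0$. Furthermore, since the
intensity (18.87) is absolutely continuous, each $\eta_j$ has an absolutely continuous
distribution. Hence $Y_{(1)}$ and every $Y_{(j)}$ is of the order $n^{1/\alpha}$, with a
continuous limit distribution $\eta_j$ (and thus no strict concentration at some
constant times $n^{1/\alpha}$).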

Note that if we consider i.i.d. variables $\xi_1,\dots,\xi_n$, then $\{\xi_i/n^{1/\alpha} : \xi_i > 0\}$
converges (as is easily verified) to a Poisson process on $(0,\infty]$ with intensity
$cx^{-\alpha-1}\,dx$. This intensity differs from the intensity of $\Xi$ in (18.87), and,
since $g(-x) \to 0$ as $x \to \infty$, it is easy to see that $\xi_{(1)}/n^{1/\alpha}$ and $Y_{(1)}/n^{1/\alpha}$
have different limit distributions. Thus, Theorem 18.7 does not hold in this
case. (However, $Y_{(1)}$ and $\xi_{(1)}$ are of the same order $n^{1/\alpha}$.) Note also that, as
an easy consequence of (18.86), the limiting point process $\Xi$ in this example
is not a Poisson process.

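Remark 18.28. The distribution of the limiting point process $\Xi$ in Example 18.27
is in principle determined by (18.86) and its extension to joint
convergence for several $X_\infty^{a,b}$. This can be made more explicit as follows.
(See Luczak and Pittel [8; for similar calculations.)

It follows from Feller [39, Section XVII.5], see e.g. [g^] for detailed calculations,
that $X_\alpha$ has the characteristic function

$$\varphi(t) = \exp\bigl(c\Gamma(-\alpha)(-\mathrm it)^{\alpha}\bigr), \qquad t \in \mathbb R. \tag{18.93}$$

(Note that $\Gamma(-\alpha) > 0$ and $\mathrm{Re}\,(-\mathrm it)^{\alpha} < 0$ for $t \ne 0$ since $1 < \alpha < 2$.) The
inversion formula gives

$$g(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-\mathrm ixt}\varphi(t)\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-\mathrm ixt + c\Gamma(-\alpha)(-\mathrm it)^{\alpha}}\,dt \tag{18.94}$$

and (18.86) yields

$$\mathbb E (X_\infty^{a,b})_\ell = \frac{c^{\ell}}{2\pi g(0)}\int_a^b\!\cdots\!\int_a^b\int_{-\infty}^{\infty}\prod_{i=1}^{\ell}x_i^{-\alpha-1}\prod_{i=1}^{\ell}e^{\mathrm ix_it}\,\varphi(t)\,dt\,dx_1\cdots dx_\ell
= \frac{1}{2\pi g(0)}\int_{-\infty}^{\infty}\Bigl(c\int_a^b x^{-\alpha-1}e^{\mathrm ixt}\,dx\Bigr)^{\ell}\varphi(t)\,dt.$$

In particular, $\mathbb E (X_\infty^{a,b})_\ell = O(C^{\ell})$ for some $C < \infty$ (with $C$ depending on $a$
but not on $b$). Hence, $X_\infty^{a,b}$ has probability generating function, convergent
for all complex $z$,

$$\mathbb E z^{X_\infty^{a,b}} = \frac{1}{2\pi g(0)}\int_{-\infty}^{\infty}\exp\Bigl((z-1)c\int_a^b x^{-\alpha-1}e^{\mathrm ixt}\,dx\Bigr)\varphi(t)\,dt. \tag{18.96}$$

We can here let $b \to \infty$, so (18.96) holds for $b = \infty$ too. In particular, taking
$z = 0$, we obtain, using (18.91), the limit distribution of $Y_{(1)}/n^{1/\alpha}$ as

$$\mathbb P(\eta_1 \le x) = \mathbb P(X_\infty^{x,\infty} = 0)
= \frac{1}{2\pi g(0)}\int_{-\infty}^{\infty}\exp\Bigl(-c\int_x^{\infty}y^{-\alpha-1}e^{\mathrm iyt}\,dy\Bigr)\varphi(t)\,dt$$
$$= \frac{1}{2\pi g(0)}\int_{-\infty}^{\infty}\exp\Bigl(-c\int_x^{\infty}y^{-\alpha-1}e^{\mathrm iyt}\,dy + c\Gamma(-\alpha)(-\mathrm it)^{\alpha}\Bigr)\,dt$$
$$= \frac{1}{2\pi g(0)}\int_{-\infty}^{\infty}\exp\Bigl(c\int_0^{x}y^{-\alpha-1}\bigl(e^{\mathrm iyt}-1-\mathrm iyt\bigr)\,dy - c\alpha^{-1}x^{-\alpha} - \frac{\mathrm ict\,x^{1-\alpha}}{\alpha-1}\Bigr)\,dt, \tag{18.97}$$

where the last equality holds because

$$\Gamma(-\alpha)u^{\alpha} = \int_0^{\infty}x^{-\alpha-1}\bigl(e^{-ux}-1+ux\bigr)\,dx \tag{18.98}$$

when $\mathrm{Re}\,u \ge 0$ and $1 < \alpha < 2$.

Furthermore, by extending (18.86) to joint factorial moments for several
(disjoint) intervals, it follows similarly, for step functions $f$, that the random
variable $X_\infty^f = \sum_j f(\eta_j)$ satisfies

$$\mathbb E e^{X_\infty^f} = \frac{1}{2\pi g(0)}\int_{-\infty}^{\infty}\exp\Bigl(c\int_0^{\infty}\bigl(e^{f(x)}-1\bigr)x^{-\alpha-1}e^{\mathrm ixt}\,dx\Bigr)\varphi(t)\,dt$$
$$= \frac{1}{2\pi g(0)}\int_{-\infty}^{\infty}\exp\Bigl(c\int_0^{\infty}\bigl(e^{f(x)}-1\bigr)x^{-\alpha-1}e^{\mathrm ixt}\,dx + c\Gamma(-\alpha)(-\mathrm it)^{\alpha}\Bigr)\,dt$$
$$= \frac{1}{2\pi g(0)}\int_{-\infty}^{\infty}\exp\Bigl(c\int_0^{\infty}\bigl(e^{f(x)+\mathrm ixt}-1-\mathrm ixt\bigr)x^{-\alpha-1}\,dx\Bigr)\,dt. \tag{18.99}$$

By taking limits, (18.99) extends to, e.g., any bounded measurable $f$ with
compact support in $(0,\infty]$. Since $\mathbb E e^{sX_\infty^f} = \mathbb E e^{X_\infty^{sf}}$ for $s \in \mathbb R$, this formula
determines (in principle) the distribution of each $X_\infty^f$ and thus of $\Xi$.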

Example 18.29. Let $(w_k)$ be as in Example 18.27 but with $\alpha = 2$, i.e.,
$w_k \sim ck^{-3}$ as $k \to \infty$, for some $c > 0$. (Example 11.10 with $\beta = 3$.) We still
have (18.79); further, $\rho = 1$, and thus $\nu = \Phi'(1) = \sum_k kw_k < \infty$. (We may
again obtain any desired $\nu > 0$, for example $\nu = 1$, by adjusting the first
few $w_k$.)

As in Example 18.27 we consider the case $m = \nu n + O(1)$, including the
tree case $m = n-1$ when $\nu = 1$. Thus, again, $m/n \to \lambda = \nu$, $\tau = 1 = \rho$,
$\pi_k = w_k$, and the random variable $\xi$ satisfies $\mathbb E\xi = \lambda = \nu$, while $\sigma^2 :=
\mathrm{Var}\,\xi = \infty$.

As in the proof of Theorem 17.14 we have the central limit theorem
(17.25), and the local limit law (17.26) holds uniformly for all integers $\ell$.
Choose $B(n) := n^{1/2}\log\log n = o(\sqrt{n\log n})$. Then, by (17.26),

$$Z(m,n) = \mathbb P(S_n = m) = \frac{g(0) + o(1)}{\sqrt{n\log n}} \tag{18.100}$$

and, uniformly for all $k \le B(n)$,

$$Z(m-k, n-1) = \mathbb P(S_{n-1} = m-k) = \frac{g(0) + o(1)}{\sqrt{n\log n}}. \tag{18.101}$$

Hence, by (18.14), (18.22) holds. Furthermore, (17.26) yields also, since
$g(0) = \max_{x\in\mathbb R}g(x)$,

$$Z(m-k, n-1) = \mathbb P(S_{n-1} = m-k) \le \frac{g(0) + o(1)}{\sqrt{n\log n}} \tag{18.102}$$

uniformly for all $k \ge 0$; hence (18.14) implies that (18.17)-(18.18) hold.
For our $B(n)$ we have by (18.79)

$$\mathbb P(\xi \ge B(n)) = O(B(n)^{-2}) = o(n^{-1}), \tag{18.103}$$

so (fimi) holds, and thus (fTM|) holds. 

The proof of Theorem 18.7 now holds without further modifications; hence
the conclusions of Theorem 18.7 hold for this example, although $\sigma^2 = \infty$.

Note that in Example 18.27, although the asymptotic distributions of $Y_{(1)}$
and $\xi_{(1)}$ are different, they are still of the same order of magnitude. We do
not know whether this is true in general. This question can be formulated
more precisely as follows.

Problem 18.30. In the case $\lambda = \nu$, do Theorem 18.7(ii)-(iii) hold also
when $\sigma^2 = \infty$?

18.6. The case $\lambda > \nu$. We turn to the case $\lambda > \nu$. Then, as briefly
discussed in Section 10, the asymptotic formula for the numbers $N_k$ in
Theorem 10.4 accounts only for $\sum_{k=0}^{\infty}k\pi_kn = \mu n = \nu n$ balls, so there
are $m - \nu n \approx (\lambda-\nu)n$ balls missing. A more careful treatment of the
limits shows that the explanation is that Theorem 10.4 really implies that
the "small" boxes (i.e., those with rather few balls) have a total of about
$\sum_{k=0}^{\infty}k\pi_kn = \mu n = \nu n$ balls, while the remaining $(\lambda-\nu)n$ balls are in a
few "large" boxes. One way to express this precisely is the following simple
result.

Lemma 18.31. Let $\mathbf w = (w_k)_{k\ge0}$ be a weight sequence with $w_0 > 0$ and
$\omega = \infty$. Suppose that $n \to \infty$ and $m = m(n)$ with $m/n \to \lambda$ where $\nu < \lambda <
\infty$.

(i) For any sequence $K_n \to \infty$,

$$\sum_{k \le K_n}kN_k \ge \nu n + o_p(n) \quad\text{and}\quad \sum_{k > K_n}kN_k \le (\lambda-\nu)n + o_p(n).$$

(ii) There exists a sequence $\Omega_n \to \infty$ such that for any sequence $K_n \to \infty$
with $K_n \le \Omega_n$ we have

$$\sum_{k \le K_n}kN_k = \nu n + o_p(n) \quad\text{and}\quad \sum_{k > K_n}kN_k = (\lambda-\nu)n + o_p(n).$$

Proof. The two statements in each part are equivalent, since

$$\sum_{k=0}^{\infty}kN_k = m = \lambda n + o(n). \tag{18.104}$$

(i): For every fixed $\ell$, Theorem 10.4 implies

$$\frac1n\sum_{k \le \ell}kN_k \overset{p}{\to} \sum_{k \le \ell}k\pi_k. \tag{18.105}$$

Let $\varepsilon > 0$. Since $\sum_{k=0}^{\infty}k\pi_k = \nu < \infty$, there exists $\ell$ such that $\sum_{k \le \ell}k\pi_k >
\nu - \varepsilon$, and (18.105) implies that w.h.p.

$$\frac1n\sum_{k \le \ell}kN_k > \nu - \varepsilon.$$

Since $K_n \ge \ell$ for large $n$ and $\varepsilon$ is arbitrary, this implies $\sum_{k \le K_n}kN_k \ge \nu n + o_p(n)$.

(ii): For each fixed $\ell$, $\sum_{k \le \ell}k\pi_k < \sum_{k=0}^{\infty}k\pi_k = \nu$, and thus (18.105)
implies $\mathbb P(\sum_{k \le \ell}kN_k > \nu n) \to 0$. Hence, there exists an increasing sequence
of integers $n_\ell$ such that if $n \ge n_\ell$, then $\mathbb P(\sum_{k \le \ell}kN_k > \nu n) < 1/\ell$. Now
define $\Omega_n = \ell$ for $n_\ell \le n < n_{\ell+1}$. Then $\sum_{k \le \Omega_n}kN_k \le \nu n$ w.h.p., which
together with (i) yields (ii). □

Consider the "large" boxes. One obvious possibility is that there is a
single "giant" box with $\approx (\lambda-\nu)n$ balls; more formally, $(\lambda-\nu)n + o_p(n)$
balls (a "monopoly"). Applying Lemma 18.31(i) with $K_n = o(n)$, we see
that for every $\varepsilon > 0$, w.h.p. there are then less than $\varepsilon n$ balls in all other
boxes with more than $K_n$ balls each; thus, either $Y_{(2)} \le K_n$ or $Y_{(2)} < \varepsilon n$.
Consequently, this case is defined by

$$Y_{(1)} = (\lambda-\nu)n + o_p(n), \tag{18.106}$$
$$Y_{(2)} = o_p(n). \tag{18.107}$$

Equivalently, $Y_{(1)}/n \overset{p}{\to} \lambda-\nu$ and $Y_{(2)}/n \overset{p}{\to} 0$. This thus describes
condensation of the missing balls to a single box.

We will see in Theorem 18.33 that, indeed, this is the case for the important
example of weights with a power-law. Another, more extreme example
is Example 9.8, $w_k = k!$, where $\nu = 0$, see Example 18.35.

However, if $(w_k)$ is very irregular, (18.106)-(18.107) do not always hold.
Examples 18.36 and 18.37 give examples where, at least for a subsequence,
either $Y_{(2)}/n \overset{p}{\to} a > 0$, so there are at least two giant boxes with order $n$
balls each (an "oligopoly"), or $Y_{(1)}/n \overset{p}{\to} 0$, so there is no giant box with
order $n$ balls, and the missing $(\lambda-\nu)n$ balls are distributed over a large
number (necessarily $\to \infty$ as $n \to \infty$) of boxes, each with a large but $o(n)$
number of balls.

Example 18.32. We consider Example 11.10; $w_k \sim ck^{-\beta}$ as $k \to \infty$. If
$\beta \le 2$, then $\nu = \infty$, see (11.40), and thus $\lambda < \nu$ and Theorems 18.3 and 18.7
apply. We are interested in the case $\lambda > \nu$, so we assume $\beta > 2$. In this
case, Jonsson and Stefansson [67] showed (for the case of random trees) that
when $\lambda > \nu$ we have the simple situation with condensation to a single giant
box. We state this in the next theorem, which also includes further, more
precise, results. (Note that the case $\lambda < \nu$ is covered by Theorems 18.3
and 18.7 with $Y_{(1)}$ of order $\log n$; the case $\lambda = \nu$ is studied in Examples
18.27 and 18.29 for $2 < \beta \le 3$, and is covered by Theorem 18.7 when $\beta > 3$;
in both cases $Y_{(1)}$ is of order $n^{1/(\beta-1)} = o(n)$.)

Theorem 18.33. Suppose that $w_k \sim ck^{-\beta}$ as $k \to \infty$ for some $c > 0$ and
$\beta > 2$. Then $\nu < \infty$. Suppose further $m/n \to \lambda > \nu$. Let $\alpha := \beta-1 > 1$
and $c' := c/\Phi(1)$.

(i) The random allocation $B_{m,n} = (Y_1,\dots,Y_n)$ has largest components

$$Y_{(1)} = (\lambda-\nu)n + o_p(n), \tag{18.108}$$
$$Y_{(2)} = o_p(n). \tag{18.109}$$

(ii) The partition function is asymptotically given by

$$Z(m,n) \sim c(\lambda-\nu)^{-\beta}\Phi(1)^{n-1}n^{1-\beta}. \tag{18.110}$$

(iii) Furthermore,

$$(Y_{(1)}, Y_{(2)},\dots,Y_{(n)}) \overset{d}{\approx} \Bigl(m - \sum_{i=1}^{n-1}\xi_i,\ \xi_{(1)},\dots,\xi_{(n-1)}\Bigr), \tag{18.111}$$

where $\xi_{(1)} \ge \dots \ge \xi_{(n-1)}$ are the $n-1$ i.i.d. random variables $\xi_1,\dots,\xi_{n-1}$,
with distribution $(\pi_k)$, ordered in decreasing order.

(iv) $Y_{(1)} = m - \nu n + O_p(n^{1/\alpha})$ and

$$n^{-1/\alpha}(m - \nu n - Y_{(1)}) \overset{d}{\to} X_\alpha, \tag{18.112}$$

where $X_\alpha$ is an $\alpha$-stable random variable with Laplace transform

$$\mathbb E e^{-tX_\alpha} = \exp\bigl(c'\Gamma(-\alpha)t^{\alpha}\bigr), \qquad \mathrm{Re}\,t \ge 0. \tag{18.113}$$

(v) $Y_{(2)} = O_p(n^{1/\alpha})$ and

$$n^{-1/\alpha}Y_{(2)} \overset{d}{\to} W, \tag{18.114}$$

where $W$ has the Frechet distribution

$$\mathbb P(W \le x) = \exp\Bigl(-\frac{c'}{\alpha}x^{-\alpha}\Bigr), \qquad x \ge 0. \tag{18.115}$$

(vi) More generally, for each $j \ge 2$, $Y_{(j)} = O_p(n^{1/\alpha})$ and

$$n^{-1/\alpha}Y_{(j)} \overset{d}{\to} W_j, \tag{18.116}$$

where $W_j$ has the density function

$$c'x^{-\alpha-1}\frac{(c'\alpha^{-1}x^{-\alpha})^{j-2}}{(j-2)!}\exp\bigl(-c'\alpha^{-1}x^{-\alpha}\bigr), \qquad x \ge 0, \tag{18.117}$$

and $c'\alpha^{-1}W_j^{-\alpha} \sim \Gamma(j-1,1)$.

Note that $\pi_k = w_k/\Phi(1)$ and that $\Gamma(-\alpha) > 0$ in (18.113).

Part (iii) shows that $Y_{(2)},\dots,Y_{(n)}$ asymptotically are as order statistics of
$n-1$ i.i.d. random variables $\xi_i$; thus the giant box absorbs the dependency
between the variables $Y_1,\dots,Y_n$ introduced by the conditioning in (10.7).

Remark 18.34. Jonsson and Stefansson [67] considered only trees, and thus
$m = n-1$ and $\lambda = 1$, and then showed the tree versions of (i) and (ii). (They
further showed Theorem 7.1 when $w_k \sim ck^{-\beta}$.) In the tree case (i) says that
the random tree $\mathcal T_n$ has w.h.p. a node of largest degree $(1-\nu)n + o(n)$, while
all other nodes have degrees $o(n)$; further, by Theorem 14.5, (ii) becomes

$$Z_n \sim c(1-\nu)^{-\beta}\Phi(1)^{n-1}n^{-\beta} \sim (1-\nu)^{-\beta}\Phi(1)^{n-1}w_n. \tag{18.118}$$

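Proof of Theorem 18.33. We may assume that $w_0 > 0$ by the argument in
Remark 10.8. Furthermore, using (10.9) for (ii), by dividing $w_k$ (and $c$) by
$\Phi(1)$, we may assume that $(w_k)$ is a probability weight sequence, and thus
$\Phi(1) = 1$. For $\lambda > \nu$ we have $\tau = \rho = 1$, and thus then $\pi_k = w_k$.

(i): $\Phi(t)$ has radius of convergence $\rho = 1$, and since $\beta > 2$, $\Phi'(1) =
\sum_k kw_k < \infty$ and $\nu = \Phi'(1)/\Phi(1) < \infty$.

Consider as in Example 10.2 i.i.d. random variables $\xi_1,\dots,\xi_n$ with
distribution $(\pi_k) = (w_k)$ and mean $\mu = \nu$.

Fix a small $\varepsilon > 0$. We assume that $\varepsilon < \lambda-\nu$.

By the law of large numbers, $S_{n-1}/n \overset{p}{\to} \mu = \nu$. We may thus find a
sequence $\delta_n \to 0$ such that $|S_{n-1} - n\nu| \le n\delta_n$ w.h.p.

Since $m/n - \nu - \delta_n \to \lambda-\nu > \varepsilon$, we have $m - \nu n - \delta_nn > \varepsilon n$ for large $n$;
we consider only such $n$.

We separate the event $S_n = m$ into four disjoint cases (subevents):

$\mathcal E_1$: Exactly one $\xi_i > \varepsilon n$, and that $\xi_i$ satisfies $|\xi_i - (m-\nu n)| \le \delta_nn$.
$\mathcal E_2$: Exactly one $\xi_i > \varepsilon n$, and that $\xi_i$ satisfies $|\xi_i - (m-\nu n)| > \delta_nn$.
$\mathcal E_3$: $\xi_i > \varepsilon n$ for at least two $i \in \{1,\dots,n\}$.
$\mathcal E_4$: All $\xi_i \le \varepsilon n$.

We shall show that $\mathcal E_1$ is the dominating event. We define also the events

$$\mathcal E_{1i} := \{S_n = m,\ |\xi_i - (m-\nu n)| \le \delta_nn \text{ and } \xi_j \le \varepsilon n \text{ for } j \ne i\},$$
$$\mathcal E_{1i}^* := \{S_n = m,\ |\xi_i - (m-\nu n)| \le \delta_nn\},$$
$$\mathcal E_{2i}^* := \{S_n = m,\ |\xi_i - (m-\nu n)| > \delta_nn,\ \xi_i > \varepsilon n\},$$
$$\mathcal D_{ij} := \{S_n = m,\ \xi_i > \varepsilon n,\ \xi_j > \varepsilon n\}.$$

Then $\mathcal E_1$ is the disjoint union $\bigcup_{i=1}^{n}\mathcal E_{1i}$; thus, by symmetry,

$$\mathbb P(\mathcal E_1) = n\,\mathbb P(\mathcal E_{1n}). \tag{18.119}$$

Furthermore, for any $i$,

$$\mathcal E_{1i} \subseteq \mathcal E_{1i}^* \subseteq \mathcal E_{1i} \cup \bigcup_{j \ne i}\mathcal D_{ij}$$

and thus, again using symmetry,

$$\mathbb P(\mathcal E_{1n}^*) \ge \mathbb P(\mathcal E_{1n}) \ge \mathbb P(\mathcal E_{1n}^*) - n\,\mathbb P(\mathcal D_{12}). \tag{18.120}$$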

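Using the fact that $|k - (m-\nu n)| \le \delta_nn$ implies $w_k \sim ck^{-\beta} \sim c(\lambda n - \nu n)^{-\beta}$,
together with $|S_{n-1} - n\nu| \le \delta_nn$ w.h.p., we obtain

$$\mathbb P(\mathcal E_{1n}^*) = \sum_{|k-(m-\nu n)| \le \delta_nn}\mathbb P(\xi_n = k,\ S_n = m)
= \sum_{|k-(m-\nu n)| \le \delta_nn}\mathbb P(\xi_n = k)\,\mathbb P(S_{n-1} = m-k)$$
$$= \sum_{|k-(m-\nu n)| \le \delta_nn}c(\lambda-\nu)^{-\beta}n^{-\beta}(1+o(1))\,\mathbb P(S_{n-1} = m-k)$$
$$= c(\lambda-\nu)^{-\beta}n^{-\beta}\,\mathbb P(|S_{n-1} - n\nu| \le \delta_nn)\,(1+o(1))
= c(\lambda-\nu)^{-\beta}n^{-\beta}(1+o(1)). \tag{18.121}$$

Similarly, allowing the constants $C_j$ here and below to depend on $\varepsilon$,

$$\mathbb P(\mathcal E_{2n}^*) = \sum_{|k-(m-\nu n)| > \delta_nn,\ k > \varepsilon n}\mathbb P(\xi_n = k,\ S_n = m)
\le C_1(\varepsilon n)^{-\beta}\sum_{|k-(m-\nu n)| > \delta_nn,\ k > \varepsilon n}\mathbb P(S_{n-1} = m-k)$$
$$\le C_2n^{-\beta}\,\mathbb P(|S_{n-1} - \nu n| > \delta_nn) = o(n^{-\beta}). \tag{18.122}$$

For any distinct $i$ and $j$, by symmetry,

$$\mathbb P(\mathcal D_{ij}) = \mathbb P(\xi_n > \varepsilon n,\ \xi_{n-1} > \varepsilon n,\ S_n = m)
= \sum_{k > \varepsilon n}\mathbb P(\xi_n = k)\,\mathbb P(S_{n-1} = m-k,\ \xi_{n-1} > \varepsilon n)$$
$$\le C_3(\varepsilon n)^{-\beta}\sum_{k > \varepsilon n}\mathbb P(S_{n-1} = m-k,\ \xi_{n-1} > \varepsilon n)
\le C_3(\varepsilon n)^{-\beta}\,\mathbb P(\xi_{n-1} > \varepsilon n) \le C_4(\varepsilon n)^{1-2\beta}. \tag{18.123}$$

Hence, (18.120), (18.121) and (18.123) yield

$$\mathbb P(\mathcal E_{1n}) = c(\lambda-\nu)^{-\beta}n^{-\beta} + o(n^{-\beta}) + O(n^{2-2\beta}) = c(\lambda-\nu)^{-\beta}n^{-\beta} + o(n^{-\beta})$$

and hence, by (18.119),

$$\mathbb P(\mathcal E_1) = c(\lambda-\nu)^{-\beta}n^{1-\beta} + o(n^{1-\beta}). \tag{18.124}$$

Furthermore, (18.122) yields

$$\mathbb P(\mathcal E_2) \le \sum_{i=1}^{n}\mathbb P(\mathcal E_{2i}^*) = n\,\mathbb P(\mathcal E_{2n}^*) = o(n^{1-\beta}), \tag{18.125}$$

and (18.123) also yields

$$\mathbb P(\mathcal E_3) \le \sum_{i<j}\mathbb P(\mathcal D_{ij}) \le n^2\,\mathbb P(\mathcal D_{12}) = O(n^{3-2\beta}) = o(n^{1-\beta}). \tag{18.126}$$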

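It remains to estimate $\mathbb P(\mathcal E_4)$. We define the truncated variables $\bar\xi_i :=
\xi_i\mathbf 1\{\xi_i \le \varepsilon n\}$ and $\bar S_n := \sum_{i=1}^{n}\bar\xi_i$. Thus $\mathcal E_4 \subseteq \{\bar S_n = m\}$ and hence, for
every real $s$,

$$\mathbb P(\mathcal E_4) \le e^{-sm}\,\mathbb E e^{s\bar S_n} = e^{-sm}\bigl(\mathbb E e^{s\bar\xi_1}\bigr)^{n}. \tag{18.127}$$

Let $s := a\log n/n$, for a constant $a > 0$ chosen later. Then,

$$\mathbb E e^{s\bar\xi_1} = 1 + s\,\mathbb E\bar\xi_1 + \sum_{k=1}^{\varepsilon n}\pi_k\bigl(e^{sk}-1-sk\bigr)
\le 1 + s\nu + C_5\sum_{k=1}^{2\beta/s}k^{-\beta}s^2k^2 + C_5\sum_{k=2\beta/s}^{\varepsilon n}k^{-\beta}e^{sk}. \tag{18.128}$$

We have, treating the cases $2 < \beta < 3$, $\beta = 3$ and $\beta > 3$ separately, using
$s \to 0$,

$$\sum_{k=1}^{2\beta/s}s^2k^{2-\beta} \le C_6s^2\max\bigl(1,\ (2\beta/s)^{3-\beta},\ \log(2\beta/s)\bigr) = o(s).$$

Furthermore, for $k \ge 2\beta/s$,

$$\frac{k^{-\beta}e^{sk}}{(k+1)^{-\beta}e^{s(k+1)}} = \Bigl(1+\frac1k\Bigr)^{\beta}e^{-s} \le \exp\Bigl(\frac{\beta}{k} - s\Bigr) \le \exp(-s/2).$$

Hence, the final sum in (18.128) is dominated by a geometric series

$$\sum_{k \le \lfloor\varepsilon n\rfloor}\lfloor\varepsilon n\rfloor^{-\beta}e^{s\lfloor\varepsilon n\rfloor}e^{-(\lfloor\varepsilon n\rfloor-k)s/2} \le C_7s^{-1}n^{-\beta}e^{s\varepsilon n} = C_7s^{-1}n^{a\varepsilon-\beta}.$$

If we assume $a\varepsilon \le \beta-2$, the sum is thus $\le C_8n^{1+a\varepsilon-\beta} \le C_8n^{-1} = o(s)$.
Consequently, (18.128) yields

$$\mathbb E e^{s\bar\xi_1} \le 1 + s\nu + o(s) \le \exp\bigl(s\nu + o(s)\bigr)$$

and thus (18.127) yields

$$\mathbb P(\mathcal E_4) \le \exp\bigl(-sm + ns\nu + o(ns)\bigr) = \exp\bigl(-ns(\lambda-\nu+o(1))\bigr). \tag{18.129}$$

We choose first $a := \beta/(\lambda-\nu)$ and then $\varepsilon < (\beta-2)/a$, and see by (18.129)
that then $\mathbb P(\mathcal E_4) \le n^{-\beta+o(1)} = o(n^{1-\beta})$. Combining (18.124), (18.125),
(18.126) and (18.129), we find

$$\mathbb P(S_n = m) = \mathbb P(\mathcal E_1) + o(n^{1-\beta}) = c(\lambda-\nu)^{-\beta}n^{1-\beta} + o(n^{1-\beta}), \tag{18.130}$$

and, in particular, $\mathbb P(\mathcal E_1 \mid S_n = m) \to 1$. Consequently, by conditioning on
$S_n = m$ we see that w.h.p. $|Y_{(1)} - (m-\nu n)| \le \delta_nn$ and $Y_{(2)} \le \varepsilon n$. Since $\varepsilon$ can
be chosen arbitrarily small, this completes the proof of (18.108)-(18.109).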

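(ii): $Z(m,n) = \mathbb P(S_n = m)$, so (18.110) follows from (18.130), since we
assume $\Phi(1) = 1$.

(iii): Since $\mathcal E_1 \subseteq \{S_n = m\}$ and $\mathbb P(\mathcal E_1 \mid S_n = m) \to 1$,

$$(Y_1,\dots,Y_n) \overset{d}{=} \bigl((\xi_1,\dots,\xi_n) \mid S_n = m\bigr) \overset{d}{\approx} \bigl((\xi_1,\dots,\xi_n) \mid \mathcal E_1\bigr). \tag{18.131}$$

When we consider the ordered variables $\xi_{(1)},\dots,\xi_{(n)}$, we may by symmetry
condition on $\mathcal E_{1n}$ instead of $\mathcal E_1$. Note that $\mathcal E_{1n}$ is the event $(\xi_1,\dots,\xi_n) \in A$,
where $A$ is the set

$$\Bigl\{(x_1,\dots,x_n) : x_j \le \varepsilon n \text{ for } j \le n-1,\ x_n = m - \sum_{i=1}^{n-1}x_i,\ \Bigl|\sum_{i=1}^{n-1}x_i - \nu n\Bigr| \le \delta_nn\Bigr\}.$$

Since $(x_1,\dots,x_n) \in A$ implies $|x_n - (m-\nu n)| \le \delta_nn$, we then have, similarly
to (18.121),

$$\mathbb P(\xi_n = x_n) \sim cx_n^{-\beta} \sim c(m-\nu n)^{-\beta} \sim c(\lambda-\nu)^{-\beta}n^{-\beta}.$$

Furthermore, $x_1,\dots,x_{n-1}$ determine $x_n$ by $\sum_{i=1}^{n}x_i = m$. It follows that,
uniformly for all $(x_1,\dots,x_n) \in A$,

$$\mathbb P\bigl((\xi_1,\dots,\xi_n) = (x_1,\dots,x_n)\bigr)
= (1+o(1))\,c(\lambda-\nu)^{-\beta}n^{-\beta}\,\mathbb P\bigl((\xi_1,\dots,\xi_{n-1}) = (x_1,\dots,x_{n-1})\bigr)$$
$$= (1+o(1))\,c(\lambda-\nu)^{-\beta}n^{-\beta}\,\mathbb P\bigl((\xi_1,\dots,\xi_{n-1}, m-S_{n-1}) = (x_1,\dots,x_n)\bigr).$$

Hence, since the factor $c(\lambda-\nu)^{-\beta}n^{-\beta}$ is a constant for each $n$,

$$\bigl((\xi_1,\dots,\xi_n) \mid \mathcal E_{1n}\bigr) \overset{d}{\approx} \bigl((\xi_1,\dots,\xi_{n-1}, m-S_{n-1}) \mid \mathcal E_n\bigr), \tag{18.132}$$

where $\mathcal E_n$ is the event

$$\{(\xi_1,\dots,\xi_{n-1}, m-S_{n-1}) \in A\} = \{\xi_j \le \varepsilon n \text{ for } j \le n-1,\ |S_{n-1} - \nu n| \le \delta_nn\}. \tag{18.133}$$

If $\mathcal E_n$ holds, then $m - S_{n-1} \ge m - \nu n - \delta_nn > \varepsilon n$ (for large $n$), so the largest
variable among $\xi_1,\dots,\xi_{n-1}, m-S_{n-1}$ is $m - S_{n-1}$. Hence, ordering the
variables, we obtain using (18.131)-(18.132)

$$(Y_{(1)},\dots,Y_{(n)}) \overset{d}{\approx} \bigl((m-S_{n-1}, \xi_{(1)},\dots,\xi_{(n-1)}) \mid \mathcal E_n\bigr). \tag{18.134}$$

Finally, observe that $|S_{n-1} - \nu n| \le \delta_nn$ w.h.p. and

$$\mathbb P(\xi_j > \varepsilon n \text{ for some } j \le n-1) \le n\,\mathbb P(\xi_1 > \varepsilon n) = O(n^{1-\alpha}) \to 0.$$

Hence, $\mathbb P(\mathcal E_n) \to 1$, and thus

$$\bigl((m-S_{n-1}, \xi_{(1)},\dots,\xi_{(n-1)}) \mid \mathcal E_n\bigr) \overset{d}{\approx} (m-S_{n-1}, \xi_{(1)},\dots,\xi_{(n-1)}). \tag{18.135}$$

The result (18.111) follows from (18.134) and (18.135).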

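(iv): By (iii), $m - n\nu - Y_{(1)} \overset{\mathrm d}{\approx} \sum_{i=1}^{n-1}(\xi_i - \nu)$, and (18.112) follows by standard results on domains of attraction for stable distributions, see e.g. Feller [sl, Section XVII.5].

(v): By (iii), $Y_{(2)} \overset{\mathrm d}{\approx} \xi_{(1)}$, and (18.114) follows by standard results on the maximum of i.i.d. random variables, as in e.g. Leadbetter, Lindgren and Rootzén [82]: using $\mathbb P(\xi > x) \sim c\alpha^{-1}x^{-\alpha}$ as $x \to \infty$, we have

\[ \mathbb P(Y_{(2)} \le xn^{1/\alpha}) = \mathbb P(\xi_{(1)} \le xn^{1/\alpha}) + o(1) = \mathbb P(\xi \le xn^{1/\alpha})^{n-1} + o(1) \]
\[ = \bigl(1 - (c\alpha^{-1} + o(1))(xn^{1/\alpha})^{-\alpha}\bigr)^{n-1} + o(1) \]
\[ \to \exp(-c\alpha^{-1}x^{-\alpha}). \]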
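The convergence used in step (v) can be checked numerically. The sketch below is an illustration only: it assumes an exact tail $\mathbb P(\xi > x) = c\alpha^{-1}x^{-\alpha}$ (that is, it drops the $o(1)$ terms), and the function names and the parameter values are hypothetical choices made for this example.

```python
import math

def max_cdf_finite_n(x, n, c, alpha):
    """Finite-n expression (1 - (c/alpha) * (x * n**(1/alpha))**(-alpha))**(n-1)."""
    tail = (c / alpha) * (x * n ** (1.0 / alpha)) ** (-alpha)
    return (1.0 - tail) ** (n - 1)

def frechet_cdf(x, c, alpha):
    """Limiting Frechet distribution function exp(-(c/alpha) * x**(-alpha))."""
    return math.exp(-(c / alpha) * x ** (-alpha))

# Illustrative parameters only (assumed): alpha = 3/2, c = sqrt(2/pi).
c, alpha, x = math.sqrt(2 / math.pi), 1.5, 1.0
for n in (10**2, 10**4, 10**6):
    print(n, max_cdf_finite_n(x, n, c, alpha), frechet_cdf(x, c, alpha))
```

The finite-$n$ values approach the limit at rate $O(1/n)$ in this idealised setting.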



(vi): Similar, cf. Leadbetter, Lindgren and Rootzén [82, Section 2.2]. □



Example 18.35. If we take $w_k = k!$, then $\nu = \rho = 0$. Consider the tree case $m = n-1$. By Example 9.8, translating to balls-in-boxes, w.h.p. there are $N_1$ boxes with 1 ball each and a single box with the remaining $n-1-N_1$ balls, while all other boxes are empty; furthermore, $N_1 \overset{\mathrm d}{\to} \mathrm{Po}(1)$ so $N_1 = O_p(1)$. Hence, $Y_{(1)} = n - O_p(1)$ and $Y_{(2)} \le 1$ w.h.p.

If we take $w_k = k!^{\alpha}$ with $0 < \alpha < 1$, and still $m = n-1$, then by Example 9.9 and [H], $Y_{(1)} = n - O_p(n^{1-\alpha}) = n - o_p(n)$ and $Y_{(2)} \le \lfloor 1/\alpha \rfloor$ w.h.p.

If we take $w_k = k!^{\alpha}$ with $\alpha > 1$, and still $m = n-1$, then by Example 9.9, w.h.p. there is a single box containing all $n-1$ balls; thus $Y_{(1)} = n-1$ and $Y_{(2)} = 0$ w.h.p.

In particular, (18.106)–(18.107) hold, with $\lambda = 1$ and $\nu = 0$, for all three cases. We guess that the same is true for any $\lambda < \infty$, but we have not checked the details.

Example 18.36. We consider the tree case $m = n-1$. Let $S := \{k_0, k_1, \dots\}$ be an infinite set with $k_0 = 0$, $k_1 = 1$, $k_2 = 2$, and $k_j$ for $j \ge 3$ chosen recursively as specified below. Let $w_k = (k+1)^{-4}$ for $k \in S$, and $w_k = 0$ otherwise; thus, $\operatorname{supp}(\mathbf w) = S$. ($S = \mathbb N_0$ gives Example 9.7 with $\beta = 4$.) Then $\rho = 1$ and

u = m = ^ ET=ok{k + l)-' ^ ^(3) _ ^(4) ^ < 1; 

(18.136) 

thus T = p = 1. 
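As a numerical sanity check on the size of $\nu$ in this example, one can bound $\nu = \sum_{k\in S} k w_k / \sum_{k \in S} w_k$ from above by summing the numerator over all of $\mathbb N_0$ (which gives $\zeta(3) - \zeta(4)$) and keeping only $w_0 + w_1 + w_2$ in the denominator. This crude bound and the helper name `nu_upper_bound` are assumptions made for this illustration; they are not claimed to be the exact estimate behind (18.136).

```python
def nu_upper_bound(terms=100_000):
    """Upper bound for nu when w_k = (k+1)^(-4) on a support S containing {0, 1, 2}."""
    # Numerator over all k >= 0: sum of k (k+1)^(-4) = zeta(3) - zeta(4), truncated.
    partial = sum(k * (k + 1) ** -4 for k in range(terms + 1))
    # Tail bound: sum_{k > K} k (k+1)^(-4) < sum_{k > K} k^(-3) < 1 / (2 K^2).
    tail_bound = 1.0 / (2 * terms ** 2)
    # Denominator: at least w_0 + w_1 + w_2, since 0, 1, 2 always lie in S.
    denominator = sum((k + 1) ** -4 for k in (0, 1, 2))
    return (partial + tail_bound) / denominator

print(nu_upper_bound())
```

The value is well below 1, in line with the conclusion $\nu < 1$.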






To begin with, we require that $k_j \ge jk_{j-1}$ for $j \ge 3$. Take $n = k_j$. A good allocation of $n-1$ balls in $n$ boxes has at most $k_{j-1}$ balls in any box, since $n-1 < k_j$, so

\[ Y_{(1)} \le k_{j-1} \le k_j/j = n/j. \tag{18.137} \]

Hence, for $n$ in the subsequence $(k_j)$, the random allocation $B_{n-1,n}$ has $Y_{(1)} = o(n)$.

Next, suppose that $k_0,\dots,k_{j-1}$ are given, and let $\mathbf w^{(k_{j-1})}$ be $\mathbf w$ truncated at $k_{j-1}$ as in (12.4); for ease of notation we denote the corresponding generating function by $\Phi_j(t) := \sum_{i=0}^{j-1} w_{k_i}t^{k_i}$ and write $\Psi_j(t) := t\Phi_j'(t)/\Phi_j(t)$ and $Z_j(m,n) := Z(m,n;\mathbf w^{(k_{j-1})})$. Note that (18.136) applies to each $\mathbf w^{(k_{j-1})}$ too, and thus

\[ \Psi_j(1) < 0.2. \tag{18.138} \]

Take $n = 3k_j$ (where $k_j$ is not yet determined). A good allocation with $n-1$ balls has at most 2 boxes with $k_j$ balls, and for the remaining boxes the weights $\mathbf w$ and $\mathbf w^{(k_{j-1})}$ coincide. We thus obtain

\[ Z(3k_j-1,3k_j) = Z_j(3k_j-1,3k_j) + 3k_jw_{k_j}Z_j(2k_j-1,3k_j-1) + \binom{3k_j}{2}w_{k_j}^2Z_j(k_j-1,3k_j-2). \tag{18.139} \]

Let the three terms on the right-hand side be $A_0, A_1, A_2$, where $A_i$ corresponds to the case when $i$ boxes have $k_j$ balls. The generating function $\Phi_j$ is a polynomial, with radius of convergence $\rho_j = \infty$ and, by Lemma 3.1, $\nu_j := \Psi_j(\infty) = \omega(\mathbf w^{(k_{j-1})}) = k_{j-1} \ge 2$. Define $\tau_j$, $\tau_j'$ and $\tau_j''$ by $\Psi_j(\tau_j) = 1$, $\Psi_j(\tau_j') = 2/3$, $\Psi_j(\tau_j'') = 1/3$. Since $\Psi_j(1) < 1/3$ by (18.138), we have $1 < \tau_j'' < \tau_j' < \tau_j < \infty$.

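Theorem 17.1 applies to each term $A_i$ in (18.139), with $\lambda = 1, \frac23, \frac13$, respectively; hence, as $k_j \to \infty$,

\[ \log A_0 = 3k_j\log\frac{\Phi_j(\tau_j)}{\tau_j} + o(k_j), \tag{18.140} \]
\[ \log A_1 = 3k_j\log\frac{\Phi_j(\tau_j')}{(\tau_j')^{2/3}} + o(k_j), \tag{18.141} \]
\[ \log A_2 = 3k_j\log\frac{\Phi_j(\tau_j'')}{(\tau_j'')^{1/3}} + o(k_j). \tag{18.142} \]

By (fTOTel) and $\tau_j'' > 1$,

\[ \frac{\Phi_j(\tau_j)}{\tau_j} \le \frac{\Phi_j(\tau_j'')}{\tau_j''} < \frac{\Phi_j(\tau_j'')}{(\tau_j'')^{1/3}} \]

and

\[ \frac{\Phi_j(\tau_j')}{(\tau_j')^{2/3}} \le \frac{\Phi_j(\tau_j'')}{(\tau_j'')^{2/3}} < \frac{\Phi_j(\tau_j'')}{(\tau_j'')^{1/3}}. \]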
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Hence, the constant multiplying kj is larger in ()18.142|) than in (I18.140p and 
(18.141), so by choosing $k_j$ large enough, we obtain $A_2 > jA_1$ and $A_2 > jA_0$, and thus

\[ \mathbb P(B_{3k_j-1,3k_j} \text{ has 2 boxes with } k_j \text{ balls}) = \frac{A_2}{A_0+A_1+A_2} > 1 - \frac{2}{j}. \tag{18.143} \]

This constructs recursively the sequence $(k_j)$ and thus $S$ and $\mathbf w$, and (18.143) shows that for $n$ in the subsequence $(3k_j)_j$, $B_{n-1,n}$ w.h.p. has 2 boxes with $n/3$ balls each.

By Lemma 16.1 it follows that, for this subsequence, $\mathcal T_n$ w.h.p. has 2 nodes with outdegrees $n/3$.

To summarise, we have found a weight sequence with $0 < \nu < 1$ such that, with $m = n-1$, for one subsequence

\[ Y_{(1)}/n \overset{\mathrm p}{\longrightarrow} 0 \tag{18.144} \]

and for another subsequence w.h.p.

\[ Y_{(1)} = Y_{(2)} = n/3. \tag{18.145} \]

Hence, neither (18.106) nor (18.107) holds. (It is easy to modify the construction such that for every $\ell \ge 1$, there is a subsequence with $Y_{(1)} = \dots = Y_{(\ell)} = n/(\ell+1)$.)

Example 18.37. Let $S := \{0\} \cup \{2^i : i \ge 0\}$. We will construct a weight sequence $\mathbf w$ recursively with support $\operatorname{supp}(\mathbf w) = S$ and $\rho = 0$. Let $w_0 = 1$.

Let $i \ge 0$. If $w_0,\dots,w_{2^i-1}$ are fixed and we let $w_{2^i} \to \infty$, then for every $m$ with $2^i \le m < 2^{i+1}$ and every $n$,

\[ \mathbb P(B_{m,n} \text{ contains a box with } 2^i \text{ balls}) \to 1. \tag{18.146} \]

Hence, we can recursively choose 1^2* so large that, for every i ^ 0, if 2* ^ 
m < 2*+i and 2^ ^ n ^ 22% then, by (fTOSjl . 

¥{Bm,n contains a box with 2* balls) > 1 - r^. (18.147) 

We further take $w_{2^i} \ge (2^i)!$; thus $\rho = 0$ and $\nu = 0$.

Consider the tree case, $m = n-1$. Thus $\lambda = 1$. If $2^i < n \le 2^{i+1}$, then (18.147) applies and shows that $B_{n-1,n}$ w.h.p. contains a box with $2^i$ balls, so w.h.p.

\[ Y_{(1)} = 2^{\lfloor\log_2(n-1)\rfloor} = 2^{\lceil\log_2 n\rceil-1}. \tag{18.148} \]

Hence, $Y_{(1)}/n$ w.h.p. is a (non-random) value that oscillates between $\frac12$ and 1, depending on the fractional part $\{\log_2 n\}$ of $\log_2 n$. Consequently, (18.106) holds for subsequences such that $0 < \{\log_2 n\} \to 0$, but not in general.
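The oscillation of the deterministic value in (18.148) is easy to tabulate. The following sketch only evaluates the formula $2^{\lfloor\log_2(n-1)\rfloor}$ with integer arithmetic; the function name `largest_box` is a hypothetical helper introduced for this illustration.

```python
def largest_box(n):
    """Value 2**floor(log2(n - 1)) from (18.148), for n >= 2, using exact integers."""
    return 1 << ((n - 1).bit_length() - 1)

# The ratio Y_(1)/n is 1/2 at n = 2**i and close to 1 just above a power of two.
for n in (1024, 1025, 1500, 2047, 2048, 2049):
    y = largest_box(n)
    print(n, y, y / n)
```

Just above a power of two the ratio is close to 1, while at an exact power of two it equals $\frac12$.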

Moreover, conditioned on the existence of a box with $2^i$ balls, the remainder of the allocation is a random allocation $B_{m-2^i,n-1}$ of the remaining $m - 2^i$ balls in $n-1$ boxes. For example, if $n = 2^{i+1}$, so $m = 2^{i+1}-1$, we have $m - 2^i = 2^i - 1$, and we can apply (18.147) again (with $i-1$) to see that w.h.p. $Y_{(2)} = 2^{i-1} = n/4$. Continuing in the same way we see that for $n$ in the subsequence $(2^i)$, we have, for each fixed $j$, w.h.p.

\[ Y_{(j)} = 2^{-j}n. \tag{18.149} \]

Hence neither (18.106) nor (18.107) holds in this case.

Similar results follow easily for other subsequences. For example, for $n$ in the subsequence $(\lfloor r2^i\rfloor)_{i\ge1}$, where $\frac12 < r < 1$ and $r$ has the infinite binary expansion $r = 2^{-\ell_1} + 2^{-\ell_2} + \dots$, with $1 = \ell_1 < \ell_2 < \cdots$, we have w.h.p. $Y_{(j)} = 2^{-\ell_j}\lceil n/r\rceil$ for each fixed $j$.

Example 18.38. Let again $m = n-1$, so $\lambda = 1$. Taking $w_k = k!$ for $k \in \operatorname{supp}(\mathbf w) = \{0\} \cup \{i! : i \ge 0\}$, we obtain an example with $\rho = 0$ and thus $\nu = 0$ such that $Y_{(1)}/n \to 0$ for some subsequences, for example for $n = i!$ (since then $Y_{(1)} \le (i-1)!$).

Problem 18.39. Is $Y_{(1)}/n \overset{\mathrm p}{\longrightarrow} 0$ possible when $0 < \nu < \lambda$? Example 18.36 shows that this is possible for a subsequence, but we conjecture that it is not possible for the full sequence, and, a little stronger, that there always is some $\varepsilon > 0$ and some subsequence along which $Y_{(1)} \ge \varepsilon n$ w.h.p.

Problem 18.40. Is $Y_{(1)}/n \overset{\mathrm p}{\longrightarrow} 0$ possible when $\lambda > \nu = 0$? (Example 18.38 shows that this is possible for a subsequence.)

We expect that bad behaviour as in the examples above can only occur for quite irregular weight sequences, but we have no general result beyond Theorem 18.33. We formulate two natural problems.

Problem 18.41. Suppose that $w_k \ge w_{k+1}$ for all (large) $k$. Does this imply that (18.106)–(18.107) hold when $\lambda > \nu$?

Problem 18.42. Suppose that $w_{k+1}/w_k \to \infty$ as $k \to \infty$. (Hence, $\rho = 0$ and $\nu = 0$.) Does this imply that (18.106)–(18.107) hold when $\lambda > \nu$?

18.7. Applications to random forests. We give some applications of the results above to the size of the largest tree(s) in different types of random forests with $n$ trees and $m \ge n$ nodes. We consider only the case $m/n \to \lambda$ with $1 < \lambda < \infty$; for simplicity we further assume that $m = \lambda n + O(1)$, although this can be relaxed and, moreover, the general case $m/n \to \lambda$ can be handled by using $\lambda_n := m/n$ and the corresponding $\tau_n := \tau(\lambda_n)$ as in Theorem 10.6; for details and for results in the cases $m = n + o(n)$ and
m/n ^ 00, see Pavlov I9I HI, S, Isl] , Kolchin Luczak and Pittel [13], 
Kazimirov and Pavlov 72] and Bernikovich and Pavlov [l^. 

The random forests considered here are described by balls-in-boxes with weight sequences with $w_0 = 0$ and $w_1 > 0$, see Section 11. As usual, we use (without further comments) the argument in Remark 10.8 to extend theorems above to the case $w_0 = 0$. (See Remark 18.21.)

We first consider random rooted forests as in Example 11.6. We have

\[ w_k = \frac{k^{k-1}}{k!} \sim \frac{1}{\sqrt{2\pi}}\,k^{-3/2}e^k, \qquad\text{as } k \to \infty, \tag{18.150} \]

and thus $w_{k+1}/w_k \to e$ as $k \to \infty$. (Alternatively, we may use $\widetilde w_k := e^{-k}w_k \sim (2\pi)^{-1/2}k^{-3/2}$, see Example 11.10.) Since $\nu = \infty$, see Examples 11.6 and 11.10, $\lambda < \nu$ and Theorem 18.19 applies for any $\lambda \in (1,\infty)$.
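We have $a = e$ and thus, by (11.27),

\[ q = \tau e = \frac{\lambda-1}{\lambda}\,e^{1/\lambda} \in (0,1) \tag{18.151} \]

and, consequently,

\[ \log(1/q) = \log\frac{\lambda}{\lambda-1} - \frac{1}{\lambda} > 0. \tag{18.152} \]

As $k \to \infty$, by (fTOTl), (11.26) and (18.150),

\[ \pi_k = \frac{w_k\tau^k}{\Phi(\tau)} \sim \frac{\lambda}{\lambda-1}\,\frac{1}{\sqrt{2\pi}}\,k^{-3/2}q^k. \tag{18.153} \]

It follows that $\pi_{k(n)} = \Theta(1/n)$ for

\[ k(n) = \frac{\log n - \frac32\log\log n}{\log(1/q)} + O(1), \tag{18.154} \]

and then (18.69) yields

\[ N \sim \frac{\lambda}{\sqrt{2\pi}(\lambda-1)(1-q)}\,k(n)^{-3/2}n \sim \frac{\lambda\log^{3/2}(1/q)}{\sqrt{2\pi}(\lambda-1)(1-q)}\,n\log^{-3/2}n. \tag{18.155} \]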

Consequently, Theorem 18.19(ii) yields the following theorem for the maximal tree size $Y_{(1)}$; this is due to Pavlov [q^, (in a slightly different formulation), who also gives further results. We further use Theorem 18.16(i) to give a simple estimate for the size $Y_{(j)}$ of the $j$:th largest tree. (More precise limit results for $Y_{(j)}$ are also easily obtained from (18.50).)



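Theorem 18.43. For a random rooted forest, with $m = \lambda n + O(1)$ where $1 < \lambda < \infty$,

\[ Y_{(1)} \overset{\mathrm d}{\approx} \left\lfloor \frac{\log n - \frac32\log\log n + \log b + W}{\log(1/q)} \right\rfloor, \tag{18.156} \]

where $W$ has the Gumbel distribution (18.63) and

\[ b := \frac{\lambda\log^{3/2}(1/q)}{\sqrt{2\pi}(\lambda-1)(1-q)}, \tag{18.157} \]

with $q$ given by (18.151)–(18.152).

Furthermore, $Y_{(j)} = Y_{(1)} + O_p(1)$ for each fixed $j$. □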
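To get a feeling for the constants, the parameters $q$ and $b$ of a random rooted forest and the distribution function of the approximating variable in (18.156) can be evaluated as follows. This is a sketch under stated assumptions: $W$ is taken to be a standard Gumbel variable with $\mathbb P(W \le x) = \exp(-e^{-x})$, the approximation error is ignored, and the function names are hypothetical.

```python
import math

def rooted_forest_params(lam):
    """q = ((lam-1)/lam) e^{1/lam} and b as in (18.157), for 1 < lam < infinity."""
    q = (lam - 1) / lam * math.exp(1 / lam)
    b = lam * math.log(1 / q) ** 1.5 / (math.sqrt(2 * math.pi) * (lam - 1) * (1 - q))
    return q, b

def approx_cdf(k, n, lam):
    """P(floor((log n - 1.5 log log n + log b + W) / log(1/q)) <= k), W standard Gumbel.

    The event is W < (k+1) log(1/q) - log n + 1.5 log log n - log b, whose
    probability is exp(-b * n * (log n)**(-1.5) * q**(k+1)).
    """
    q, b = rooted_forest_params(lam)
    return math.exp(-b * n * math.log(n) ** -1.5 * q ** (k + 1))

q, b = rooted_forest_params(2.0)
print(q, b, approx_cdf(60, 10**6, 2.0))
```

For $\lambda = 2$ this gives $q = \frac12 e^{1/2} \approx 0.82$, so the maximal tree size grows like $\log n / \log(1/q)$ with a fairly large constant.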



Next, let us, more generally, consider a random simply generated forest as in Example 11.8 defined by a weight sequence $\mathbf w$. Then the tree sizes in the random forest are distributed as balls-in-boxes with the weight sequence $(Z_k)_{k=0}^{\infty}$, where $Z_k$ is the partition function (2.5) for simply generated trees with weight sequence $\mathbf w$ (and $Z_0 = 0$).

We assume that $\nu(\mathbf w) \ge 1$; thus there exists $\tau_1 > 0$ such that $\Psi(\tau_1) = 1$, and then $\mathbf w' := (\tau_1^k w_k/\Phi(\tau_1))_k$ is an equivalent probability weight sequence with expectation 1, see Lemma 4.2. ($\tau_1$ is the same as $\tau$ in Theorem 7.1 but here we need to consider several different $\tau$'s so we modify the notation.) This probability weight sequence $\mathbf w'$ defines the same random forest, which thus can be realized as a conditioned critical Galton–Watson forest. Recall from (4.10) and Theorem 7.1 that the probability distribution $\mathbf w'$ has variance $\sigma^2 = \tau_1\Psi'(\tau_1)$; we assume that $\sigma^2$ is finite, which always holds if $\nu(\mathbf w) > 1$ and thus $\tau_1 < \rho(\mathbf w)$. We further assume, for simplicity, that $\mathbf w$ has span 1. We then have the following generalization of Theorem 18.43,
see Pavlov (ii, IqH ]. where also further results are given. 



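Theorem 18.44. Consider a simply generated random forest defined by a weight sequence $\mathbf w$, and assume that $m = \lambda n + O(1)$ where $1 < \lambda < \infty$. Suppose that $\nu(\mathbf w) \ge 1$ and $\operatorname{span}(\mathbf w) = 1$. Define $\tau_1 > 0$ by $\Psi(\tau_1) = 1$, and assume that $\sigma^2 := \tau_1\Psi'(\tau_1) < \infty$ (this is automatic if $\nu(\mathbf w) > 1$). Define further $\tau_2 > 0$ by

\[ \Psi(\tau_2) = 1 - 1/\lambda \tag{18.158} \]

and let

\[ q := \frac{\tau_2}{\Phi(\tau_2)}\cdot\frac{\Phi(\tau_1)}{\tau_1}. \tag{18.159} \]

Then $0 < q < 1$ and

\[ Y_{(1)} \overset{\mathrm d}{\approx} \left\lfloor \frac{\log n - \frac32\log\log n + \log b + W}{\log(1/q)} \right\rfloor, \tag{18.160} \]

where $W$ has the Gumbel distribution (18.63) and

\[ b := \frac{\tau_1\log^{3/2}(1/q)}{\tau_2\sqrt{2\pi\sigma^2}\,(1-q)}. \tag{18.161} \]

Furthermore, $Y_{(j)} = Y_{(1)} + O_p(1)$ for each fixed $j$.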

Proof. Replace $\mathbf w$ by the equivalent probability weight sequence $\widetilde{\mathbf w} = (\widetilde w_k)$ with $\widetilde w_k := \tau_2^k w_k/\Phi(\tau_2)$. This probability weight sequence has expectation $\Psi(\tau_2) < 1$ by (4.9), and using it we realize the random forest as a conditioned subcritical Galton–Watson forest. The partition function for $\widetilde{\mathbf w}$ is by (4.3) and Theorem 17.11

\[ \widetilde Z_k = \frac{\tau_2^{k-1}}{\Phi(\tau_2)^k}\,Z_k \sim \frac{1}{\sqrt{2\pi\sigma^2}}\,\frac{\tau_2^{k-1}}{\Phi(\tau_2)^k}\,\frac{\Phi(\tau_1)^k}{\tau_1^{k-1}}\,k^{-3/2}. \tag{18.162} \]

Moreover, by (2.6), $(\widetilde Z_k)$ is the distribution of the size of a Galton–Watson process with offspring distribution $\widetilde{\mathbf w}$. Since this offspring distribution is subcritical with expectation $\Psi(\tau_2) < 1$, the size distribution $(\widetilde Z_k)$ has finite mean

\[ \sum_{k=0}^{\infty} k\widetilde Z_k = \frac{1}{1-\Psi(\tau_2)} = \lambda, \tag{18.163} \]

by our choice of $\tau_2$.

The sizes of the trees in the random forest are distributed as balls-in-boxes with the weight sequence $(\widetilde Z_k)$, see Example 11.8. We apply Theorem 18.19, translating $w_k$ to $\widetilde Z_k$. By (18.162),

\[ \widetilde Z_{k+1}/\widetilde Z_k \to a := \frac{\tau_2}{\Phi(\tau_2)}\cdot\frac{\Phi(\tau_1)}{\tau_1} \qquad\text{as } k \to \infty. \tag{18.164} \]

Note further that (with this weight sequence $(\widetilde Z_k)$) $\tau$ in Theorem 18.19 is chosen such that the equivalent probability weight sequence $(\tau^k\widetilde Z_k/\widetilde{\mathcal Z}(\tau))$ has expectation $\lambda$. We have already constructed $(\widetilde Z_k)$ such that it is a probability weight sequence with this expectation, see (18.163); hence we have $\tau = 1$ and $q = a$, which yields (18.159).

As in (18.154), $\pi_{k(n)} = \widetilde Z_{k(n)} = \Theta(1/n)$ for

\[ k(n) = \frac{\log n - \frac32\log\log n}{\log(1/q)} + O(1), \tag{18.165} \]

and then (18.69) yields, by (18.162),

\[ N \sim \frac{\tau_1}{\sqrt{2\pi\sigma^2}\,\tau_2(1-q)}\,k(n)^{-3/2}n \sim \frac{\tau_1\log^{3/2}(1/q)}{\tau_2\sqrt{2\pi\sigma^2}\,(1-q)}\,n\log^{-3/2}n. \tag{18.166} \]

The result (18.160) now follows from Theorem 18.19(ii). Finally, again, Theorem 18.16(i) gives the estimate for $Y_{(j)}$. □

Example 18.45. Consider a random ordered rooted forest. This is obtained by the weight sequence $w_k = 1$, see Example 11.8, and we have by (9.1)–(9.2) $\Phi(t) = 1/(1-t)$ and $\Psi(t) = t/(1-t)$. Hence, $\tau_1 = 1/2$ and $\sigma^2 = 2$ (see Example 9.1); furthermore, (18.158) is $\tau_2/(1-\tau_2) = 1 - 1/\lambda$, which has the solution

\[ \tau_2 = \frac{\lambda-1}{2\lambda-1}. \tag{18.167} \]

Consequently, Theorem 18.44 says that (18.160) holds, with the parameters $q$ and $b$ given by, see (18.159) and (18.161),

\[ q = \frac{\tau_2(1-\tau_2)}{\tau_1(1-\tau_1)} = \frac{4\lambda(\lambda-1)}{(2\lambda-1)^2} = 1 - \frac{1}{(2\lambda-1)^2} \tag{18.168} \]

and

\[ b = \frac{(2\lambda-1)^3}{4\sqrt{\pi}\,(\lambda-1)}\,\log^{3/2}(1/q). \tag{18.169} \]
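The closed forms in this example can be cross-checked against the general definitions of Theorem 18.44, $q = \tau_2\Phi(\tau_1)/(\tau_1\Phi(\tau_2))$ and $b = \tau_1\log^{3/2}(1/q)/(\tau_2\sqrt{2\pi\sigma^2}(1-q))$, with $\Phi(t) = 1/(1-t)$. The sketch below does this numerically; the function name is a hypothetical helper for this illustration.

```python
import math

def ordered_forest_params(lam):
    """tau_2, q and b for the random ordered rooted forest (w_k = 1), 1 < lam < infinity."""
    tau1, sigma2 = 0.5, 2.0                      # Psi(tau1) = 1, sigma^2 = tau1 * Psi'(tau1)
    tau2 = (lam - 1) / (2 * lam - 1)             # solves tau2 / (1 - tau2) = 1 - 1/lam
    # tau / Phi(tau) = tau * (1 - tau) because Phi(t) = 1 / (1 - t).
    q = tau2 * (1 - tau2) / (tau1 * (1 - tau1))
    b = tau1 * math.log(1 / q) ** 1.5 / (tau2 * math.sqrt(2 * math.pi * sigma2) * (1 - q))
    return tau2, q, b

lam = 3.0
tau2, q, b = ordered_forest_params(lam)
print(tau2, q, b)
# Compare with the closed forms 4 lam (lam-1) / (2 lam - 1)^2 and (2 lam - 1)^3 / (4 sqrt(pi) (lam-1)).
print(4 * lam * (lam - 1) / (2 * lam - 1) ** 2,
      (2 * lam - 1) ** 3 / (4 * math.sqrt(math.pi) * (lam - 1)) * math.log(1 / q) ** 1.5)
```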

Example 18.46. The random rooted unlabelled forest in Example 111.111 
is described by a weight sequence that also satisfies Wk ~ cik'^^"^ as 
A; — >• 00, and we thus again obtain (jl8.160p . although the parameters q and 
h now are implicitly defined using the generating function of the number of 
unlabelled rooted trees, see Pavlov 97l |. 
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Example 18.47. For the random recursive forest in Example 111.13^ we 
have

\[ w_k = k^{-1}. \tag{18.170} \]

Thus Theorem 18.19 applies with $a = 1$ and $q = \tau \in (0,1)$ given by

\[ \frac{q}{(1-q)\,|\log(1-q)|} = \lambda, \tag{18.171} \]

see (11.49). (Recall that $\nu = \infty$, so we can take any $\lambda > 1$ here.) In this case, see pXIgjl, $\pi_{k(n)} = k(n)^{-1}q^{k(n)}/|\log(1-q)| = \Theta(1/n)$ for

\[ k(n) = \frac{\log n - \log\log n}{\log(1/q)} + O(1), \tag{18.172} \]

cf. (18.154), and then (18.69) yields

\[ N \sim \frac{1}{(1-q)\,|\log(1-q)|}\,k(n)^{-1}n \sim \frac{\log(1/q)}{(1-q)\,|\log(1-q)|}\,n\log^{-1}n. \tag{18.173} \]

Consequently, Theorem 18.19(ii) yields

\[ Y_{(1)} \overset{\mathrm d}{\approx} \left\lfloor \frac{\log n - \log\log n + \log b + W}{\log(1/q)} \right\rfloor, \tag{18.174} \]

where $W$ has the Gumbel distribution (18.63) and, using (18.171),

\[ b := \frac{\log(1/q)}{(1-q)\,|\log(1-q)|} = \frac{\lambda\log(1/q)}{q}. \tag{18.175} \]

We thus obtain a result similar to the cases above, but with a different coefficient for $\log\log n$ in (18.174). See Pavlov and Loseva [98] for further results.
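Since (18.171) defines $q$ only implicitly, a numerical solver is convenient. The following sketch uses plain bisection, relying on the fact that the left-hand side of (18.171) increases from 1 to $\infty$ on $(0,1)$; the function names and the tolerance are hypothetical choices for this illustration.

```python
import math

def psi(q):
    """Left-hand side of (18.171): q / ((1 - q) |log(1 - q)|), for 0 < q < 1."""
    return q / ((1 - q) * -math.log1p(-q))

def solve_q(lam, tol=1e-13):
    """Solve psi(q) = lam by bisection on (0, 1); requires lam > 1."""
    lo, hi = 1e-12, 1 - 1e-12
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if psi(mid) < lam:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

lam = 2.0
q = solve_q(lam)
b = math.log(1 / q) / ((1 - q) * -math.log1p(-q))
print(q, b, lam * math.log(1 / q) / q)   # the last two values agree, cf. (18.175)
```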

If we consider the random unrooted forest in Example 11.7, we find different results. In this case, the tree sizes are described by balls-in-boxes with the weight sequence $w_k = k^{k-2}/k!$, $k \ge 1$ (and $w_0 = 0$). Alternatively, we can use the probability weight sequences in (11.36), in particular the probability weight sequence, recalling $\Phi(e^{-1}) = 1/2$ from (11.32),

\[ \widetilde w_k := \frac{w_k e^{-k}}{\Phi(e^{-1})} = \frac{2k^{k-2}e^{-k}}{k!}, \tag{18.176} \]

which by Stirling's formula satisfies

\[ \widetilde w_k \sim \Bigl(\frac{2}{\pi}\Bigr)^{1/2}k^{-5/2}, \qquad\text{as } k \to \infty. \tag{18.177} \]

Since we now have $\nu = 2 < \infty$, see Examples 11.7 and 11.10, there is a phase transition at $\lambda = 2$. We show in the theorem below that for $\lambda < 2$ we have a result similar to Theorems 18.43 and 18.44 with maximal tree size $Y_{(1)} = O_p(\log n)$, but for $\lambda > 2$ there is a unique giant tree with size of order $n$. At the phase transition, with $m/n \to 2$, the result depends on the rate of convergence of $m/n$; if, for example, $m = 2n$ exactly, the maximal size is of order $n^{2/3}$; see further Luczak and Pittel ^], where precise results for general $m = m(n)$ are given. (By the proof below, (iii) in the following theorem holds as soon as $m/n \to \lambda > 2$, but (i) and (ii) are more sensitive.)

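Theorem 18.48. Consider a random unrooted forest, and assume that $m = \lambda n + O(1)$ where $1 < \lambda < \infty$.

(i) If $1 < \lambda < 2$, let

\[ q := \frac{2(\lambda-1)}{\lambda}\,e^{2/\lambda-1}. \tag{18.178} \]

Then $0 < q < 1$ and

\[ Y_{(1)} \overset{\mathrm d}{\approx} \left\lfloor \frac{\log n - \frac52\log\log n + \log b + W}{\log(1/q)} \right\rfloor, \tag{18.179} \]

where $W$ has the Gumbel distribution (18.63) and

\[ b := \frac{\lambda^2\log^{5/2}(1/q)}{2\sqrt{2\pi}\,(\lambda-1)(1-q)}. \tag{18.180} \]

Furthermore, $Y_{(j)} = Y_{(1)} + O_p(1)$ for each fixed $j$.

(ii) If $\lambda = 2$, then

\[ n^{-2/3}Y_{(j)} \overset{\mathrm d}{\longrightarrow} \eta_j \tag{18.181} \]

for each $j$, where $\eta_j > 0$ are some random variables. The distribution of $\eta_1$ is given by (18.97) with $\alpha = 3/2$ and $c = (2/\pi)^{1/2}$.

(iii) If $2 < \lambda < \infty$, then $Y_{(1)} = (\lambda-2)n + O_p(n^{2/3})$. More precisely,

\[ n^{-2/3}\bigl(m - 2n - Y_{(1)}\bigr) \overset{\mathrm d}{\longrightarrow} X, \tag{18.182} \]

where $X$ is a $\frac32$-stable random variable with Laplace transform

\[ \mathbb E\,e^{-tX} = \exp\Bigl(\frac{2^{5/2}}{3}\,t^{3/2}\Bigr), \qquad \operatorname{Re} t \ge 0. \tag{18.183} \]

For $j \ge 2$, $Y_{(j)} = O_p(n^{2/3})$, and $n^{-2/3}Y_{(j)} \overset{\mathrm d}{\longrightarrow} W_j$ where $W_2$ has the Fréchet distribution

\[ \mathbb P(W_2 \le x) = \exp\Bigl(-\frac{2^{3/2}}{3\sqrt{\pi}}\,x^{-3/2}\Bigr), \qquad x > 0, \tag{18.184} \]

and, more generally, $W_j$ has the density function (18.117) with $c' = (2/\pi)^{1/2}$ and $\alpha = 3/2$.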

Note that the exponents $\frac32$, 1 and $\frac52$ in (18.150), (18.170) and (18.177) appear as coefficients of $\log\log n$ in (18.156), (18.174) and (18.179), respectively.

Proof. (i): This is very similar to the proofs of Theorems 18.43 and 18.44. We use $w_k = k^{k-2}/k!$. Then, as for rooted forests and (18.150) above, $w_{k+1}/w_k \to e$ as $k \to \infty$. Further, $\tau$ is given by (11.34), and thus $q := \tau e$ is given by (18.178). It follows, cf. (18.154) and (18.177), that $\pi_{k(n)} = \Theta(1/n)$ for

\[ k(n) = \frac{\log n - \frac52\log\log n}{\log(1/q)} + O(1), \tag{18.185} \]

and then (18.69) yields

\[ N \sim \frac{(2\pi)^{-1/2}k(n)^{-5/2}n}{\Phi(\tau)(1-q)} = \frac{\lambda^2k(n)^{-5/2}n}{2\sqrt{2\pi}\,(\lambda-1)(1-q)} \sim \frac{\lambda^2\log^{5/2}(1/q)}{2\sqrt{2\pi}\,(\lambda-1)(1-q)}\,n\log^{-5/2}n. \tag{18.186} \]

Hence Theorem 18.19(ii) yields (18.179).

(ii): We use the equivalent probability weight sequence $(\widetilde w_k)$ given by (18.176). By (18.177), it satisfies the assumptions in Example 18.27 with $\alpha = 3/2$ and $c = (2/\pi)^{1/2}$; thus (18.181) follows from (fTSMT), and (18.97) in Remark 18.28 applies.

(iii): We use again the probability weight sequence $(\widetilde w_k)$ and apply Theorem 18.33. We have $c' = c = (2/\pi)^{1/2}$, see (18.176), and thus $c'\Gamma(-3/2) = c'\,\frac43\,\Gamma(1/2) = 2^{5/2}/3$ and $c'/\alpha = 2^{3/2}/(3\sqrt{\pi})$. □



Example 18.49. The random unrooted unlabelled forest (with labelled 
trees) in Example 111.111 is described by another weight sequence that sat- 
isfies Wk ~ ck'^^"^ as /c — 7- oo, and we thus obtain a result similar to 
Theorem 118.481 although the parameters differ (they can be obtained from 
the generating function of the number of unlabelled trees); in particular, the 
phase transition appears when A is ~ 2.0513, see Bernikovich and Pavlov 
[13] for details. 

We do not know any corresponding results for random completely unla- 
belled forests (n unlabelled trees consisting of m unlabelled nodes); as said 
in Example 111.111 they cannot be described by balls-in-boxes. 

19. Large nodes in simply generated trees with $\nu < 1$

In the tree case with $\nu < 1$, the results in Section 18.6 show condensation in the form of one or, sometimes, several nodes with very large degree, together making up the "missing mass" of about $(1-\nu)n$. On the other hand, Theorem 7.1 shows concentration in a somewhat different form, with a limit tree $\widehat{\mathcal T}$ having exactly one node of infinite degree. This node corresponds to a node with very large degree in $\mathcal T_n$ for $n$ large but finite. How large is the degree? Why do we only see one node with very large degree in Theorem 7.1 but sometimes several nodes with large degrees above (Examples 18.36 and 18.37)?
The latter question is easily answered: recall that the convergence in Theorem 7.1 means convergence of the truncated trees ("left balls") $\mathcal T_n^{[m]}$, see Lemma 6.3; thus we only see a small part of the tree close to the root, and the two pictures above are reconciled: if $m$ is large but fixed, then in the set $V(\mathcal T_n) \cap V^{[m]}$ of nodes, there is with probability close to 1 exactly one node with very large degree. (There may be several nodes with very large degree in the tree, but for any fixed $m$, w.h.p. at most one of them is in $V^{[m]}$.) Of course, to make this precise, we would have to define "very large", for example as below using a sequence $\Omega_n$ growing slowly to $\infty$ as in Lemma 18.31, but we are at the moment satisfied with an intuitive description.

To see how large the "very large" degree is, let us first look at the root. Lemma 14.7 says that the distribution of the root degree is the size-biased distribution of $Y_1$. We can write (14.7) as

\[ \mathbb P\bigl(d^+_{\mathcal T_n}(o) = d\bigr) = \frac{d}{n-1}\sum_{i=1}^{n}\mathbb P(Y_i = d) = \frac{1}{n-1}\,\mathbb E\sum_{j=1}^{n}Y_{(j)}\mathbf 1\{Y_{(j)} = d\}; \tag{19.1} \]

hence the distribution of the root degree can be described by: sample $(Y_1,\dots,Y_n)$ and then take $Y_{(1)}$ with probability $Y_{(1)}/(n-1)$, $Y_{(2)}$ with probability $Y_{(2)}/(n-1)$, ....
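The sampling description above amounts to a size-biased pick from the occupancy numbers. The following sketch computes, for one given allocation $(Y_1,\dots,Y_n)$ with $\sum_i Y_i = n-1$, the conditional distribution of the picked value; taking the expectation over the allocation then gives the root degree distribution. The function name and the toy allocation are hypothetical, introduced only for this illustration.

```python
from collections import Counter
from fractions import Fraction

def size_biased_pmf(occupancy):
    """Distribution of a size-biased pick: value d is chosen with probability
    d * #{i : Y_i = d} / sum(Y), so boxes with 0 balls are never picked."""
    total = sum(occupancy)
    counts = Counter(occupancy)
    return {d: Fraction(d * c, total) for d, c in counts.items() if d > 0}

# Toy allocation with n = 6 boxes and n - 1 = 5 balls.
print(size_biased_pmf([3, 1, 1, 0, 0, 0]))
```

In a condensed allocation, where one box holds about $(1-\nu)n$ balls, the pick equals that box's content with probability about $1-\nu$.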

In particular, if $Y_{(1)} = (1-\nu)n + o_p(n)$, then (19.1) implies

\[ \mathbb P\bigl(d^+_{\mathcal T_n}(o) = Y_{(1)}\bigr) = 1 - \nu + o(1), \tag{19.2} \]

and comparing with Theorem 7.10 we see that w.h.p. either the root degree is small (more precisely, $O_p(1)$), or it is the maximum outdegree $Y_{(1)}$. However, we also see that if $Y_{(1)}$ is not $(1-\nu)n + o_p(n)$, then this conclusion does not hold; for example, in Example 18.37, for $n$ in the subsequence $(2^i)$ where (18.149) holds for each fixed $j$,

\[ \mathbb P\bigl(d^+_{\mathcal T_n}(o) = 2^{-j}n\bigr) \to 2^{-j}. \tag{19.3} \]

In the case $\nu = 0$, we only have to consider the root, since the node with infinite degree in $\widehat{\mathcal T}$ always is the root, but for $0 < \nu < 1$, the node with infinite degree in $\widehat{\mathcal T}$ may be somewhere else. We shall see that it corresponds to a node in $\mathcal T_n$ with a large degree having (asymptotically) the same distribution as the root degree just considered, conditioned to be "large".

To make this precise, let $\Omega_n \to \infty$ be a fixed sequence which increases so slowly that Lemma 18.31(ii) holds. We say that an outdegree $d^+(v)$ is large if it is greater than $\Omega_n$; we then also say that the node $v$ is large. (Note that by Lemma 18.31(ii), w.h.p. at least one large node exists.) For each $n$, let $\widetilde D_n$ be a random variable whose distribution is the size-biased distribution of a large outdegree, i.e. of $(Y_1 \mid Y_1 > \Omega_n)$:

\[ \mathbb P(\widetilde D_n = k) = \frac{k\,\mathbb P(Y_1 = k)}{\sum_{l>\Omega_n} l\,\mathbb P(Y_1 = l)} = \frac{k\,\mathbb E N_k}{\sum_{l>\Omega_n} l\,\mathbb E N_l} = \frac{k\,\mathbb E N_k}{(1-\nu+o(1))\,n}, \tag{19.4} \]

for $k > \Omega_n$ and $\mathbb P(\widetilde D_n = k) = 0$ otherwise. Equivalently, in view of Lemma 14.7, $\widetilde D_n$ has the distribution of the root degree $d^+_{\mathcal T_n}(o)$ conditioned to be greater than $\Omega_n$. See also (19.1), and note that if $Y_{(1)} = (1-\nu)n + o_p(n)$, then $\widetilde D_n \overset{\mathrm d}{\approx} Y_{(1)}$; we may take $\widetilde D_n = Y_{(1)}$ w.h.p.; in this case (but not otherwise) we thus have $\widetilde D_n = (1-\nu)n + o_p(n)$.

Note that if $\Omega_n'$ is another such sequence, similarly defining a random variable $\widetilde D_n'$, then $\sum_{l>\Omega_n} l\,\mathbb P(Y_1 = l) \sim (1-\nu)n \sim \sum_{l>\Omega_n'} l\,\mathbb P(Y_1 = l)$, and it follows that $\widetilde D_n \overset{\mathrm d}{\approx} \widetilde D_n'$; hence the choice of $\Omega_n$ will not matter below.

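We claim that, w.h.p., the infinite outdegree in $\widehat{\mathcal T}$ corresponds to an outdegree $\widetilde D_n$ in $\mathcal T_n$. To formalise this, recall from Section 6 that we may consider our trees as subtrees of the infinite tree $U_\infty$ with node set $V_\infty$, and that the convergence of trees defined there means convergence of each $d^+(v)$, see (6.6). Let $\widehat{\mathcal T}$ be the random infinite tree defined in Section 5; we are in case (T2), and thus $\widehat{\mathcal T}$ has a single node $v$ with outdegree $d^+_{\widehat{\mathcal T}}(v) = \infty$. We assume that $\widetilde D_n$ and $\widehat{\mathcal T}$ are independent, and define the modified degree sequence

\[ \tilde d^+(v) := \begin{cases} d^+_{\widehat{\mathcal T}}(v), & d^+_{\widehat{\mathcal T}}(v) < \infty, \\ \widetilde D_n, & d^+_{\widehat{\mathcal T}}(v) = \infty. \end{cases} \tag{19.5} \]

We thus change the single infinite value to the finite $\widetilde D_n$, leaving all other values unchanged. (Note that $\tilde d^+(v)$ may depend on $n$, since $\widetilde D_n$ does.) We then have the following theorem.

Theorem 19.1. For any finite set of nodes $v_1,\dots,v_\ell \in V_\infty$,

\[ \bigl(d^+_{\mathcal T_n}(v_1),\dots,d^+_{\mathcal T_n}(v_\ell)\bigr) \overset{\mathrm d}{\approx} \bigl(\tilde d^+(v_1),\dots,\tilde d^+(v_\ell)\bigr). \tag{19.6} \]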

Proof. Let e > 0, and let v* denote the unique node in T with di{v*) = oo. 

By increasing the set {vi, . . . ,Vi}, we may assume that it equals V^"^^ (see 
Section [6|) for some m, and that m is so large that P(t;* € yM) > i _ 
We may then find K < oo such that 

^{d±{v) £ {K,oo) for some v £ V^""^) < e. 

Since Tn — — >• T by Theorem 17. H we may by the Skorohod coupling theorem 
(69l . Theorem 4.30] assume that the random trees are coupled such that 
Tn ^ T a.s., and thus d'^^{v) — t- d'^{v) a.s. for every v. Then, for large 

n, with probability > 1 - 3e, G l/H, dX- [v) = d±{v) = d±{v) ^ K 

for ah V £ \ {v*}, and (i:j-^(w*) — ^ d't(v*) = oo. We may assume that 

Qn — 7- oo so slowly that furthermore ¥{dj-^{v*) ^ $7„) ^ e. (Recall that we 
may change r2„ without affecting the result (|19.6p .) 

Let n be so large that also Qn > ^n and Vtn > K. It follows from 
Lemma 114.91 that for each choice of v' £ yt™] and numbers d{v) for v £ 
\ v', and k > Qn, 



\d^^{v) = d{v) for V £ l/I™! \ {v'] and d+-^{v') = k) 
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= {k + 0{l))Ci{d{v)},v' ,n)F{Yi = k) 
for some constant C {{d{v)} , v' , n) ^ not depending on k; hence, by (|19.4p . 

P(d+ = k I d+ (w) = d{v) for V e \ {v'} and d^Jv') > Qn) 

There is only a finite number of choices of v' and ))^gy[m]y{^/}, and it 

follows that we may choose the coupling of Tn and T above such that also 
dj-^{v*) = Dn w.h.p.; thus, with probability > 1 — 4e — o(l), dj-^{v) = 
for all 77 G yM. 

The result follows since e > is arbitrary. □ 

We give some variations of this result, where we replace di{v) by the 

degree sequences of some random trees obtained by modifying T. (Note 
that ) is not the degree sequence of a tree.) 

First, let Tin be the random tree obtained by pruning the tree T at the 
node V* with infinite outdegree, keeping only the first Dn children of v* . 
Then Tin is a locally finite tree, and, in fact, it is a.s. finite. The random 
tree Tin can be constructed as T in Section [5l starting with a spine, and 
then adding independent Galton- Watson trees to it, but now the number of 
children of a node in the spine is given by a finite random variable ^„ with 
the distribution 

F{in = k)= =k)+ P(e = OO) P(5„ = k) = kTTk + {I - 1^) FiDn = k). 

(19.7) 

The nodes not in the spine (the normal nodes) have offspring distribution 
(vTfc) as before. (This holds also for the following modifications.) 

The spine in Tin stops when we obtain ^ = oo, but we may also define 
another random tree Tin by continuing the spine to infinity; this defines a 
random infinite but locally finite tree having an infinite spine; each node in 
the spine has a number of children with the distribution in (119. 7p . and the 
spine continues with a uniformly randomly chosen child. Equivalently, Tin 
can be defined by a Galton-Watson process with normal and special nodes 
as in Section m but with the offspring distribution for special nodes changed 
from ([521) to (fT97fl) . 

Finally, let 1^ by a random variable with the size-biased distribution of 
Yi: 

kF(Yi = k) kENk 

F{Yn = k)= ^ \ > = ^, 19.8 

(n — l)/n n — 1 

recahing that Y.k = n - 1; cf. (fT9TTl and ([19:4]) . (Thus Yn = d:^^(o) by 

Lemma [l4.7l and (119.11) .) Define the infinite, locally finite random tree Tsn by 
the same Galton-Watson process again, but now with offspring distribution 
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Yn for special nodes. (This does not involve or r2„.) Thus Tin also has 
an infinite spine. 

We then have the following version of Theorem 119. H where we also use 
the metric 5i on Tif defined by 

6i{Ti,T2) := l/sup{m ^ 1 : = d-T^i^) for v G ^1™]}. (19.9) 

Theorem 19.2. For j = 1, 2, 3, and any finite set of nodes vi, . . . ,Vi £ Voo, 

{dl{v,),...,dl{ve)) ^ (4 {v,),...,d+ (v,)). (19.10) 

Ijn I jn 

Equivalently, there is a coupling of Tn and Tjn such that Si{Tn,Tjn) —?■ as 
n — )• oo. 

Proof. If ri„ > m we have Dn > m and then the branches of T pruned to 
make Tin are all outside yt™-], and thus di = defined in (119.51) for all 

^ Tin T 

V E l/M. Thus the result for Tin follows from Theorem 119.11 

Next, for any given m, and for any endpoint x of the spine of 71™, the 
probability that the continuation in Tin of the spine contains some node in 
is less than m/Qn = o(l); thus, w.h.p. Tin and Tn are equal on any 

Finally, Lemma 119.31 below implies that we can couple Tin and Tsn such 

that they w.h.p. agree on each F^"*); then Tin and Tsn are w.h.p. equal on 
eachl/H. □ 

Lemma 19.3. ^„ ^ Yn- 

Proof. For each fixed k, P(^n = k) = knk as soon as > k, and P(l^ = 
k) k-Kk by (fTMl) and Theorem [1071 Hence, 

\F{in = k)-F{Yn = k)\^0. (19.11) 

By (fT97n) . (fTOl) and (fTMIl . uniformly for /o 

Fiin = k) = kTTk + (1 - ^) ^^(^i = ^) = kTTk + (1 + o(l)) P(yn = A:); 

1 — V + o(l) ^ 

hence 

|P(en = fc)-P(l'n = k)\^ Yl (^vrfc+o(l)P(Fn = A:)) = ^ A;7rfc+o(l) 

J^^^^^^fi /c^^f^TT, /c^^S"^^^, 

(19.12) 

Further, for any fixed K, 

^ (Pdn = k)- ¥(Yn = k))^^ Yl ^(^" = ^) = E ^^'^^ (19.13) 

fc=_fS'+l k=K+l k=K+l 



Using Lemma [ISi ^vii)] together with (119. lip for A; ^ (I19.12|) and (119. 13p 
we obtain 

oo oo 

dTv(ln,l'n) = ^(P(en = A;)-P(Fn = A:))^^ ^ A:^fc + o(l). (19.14) 

k=l k=K+l 
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Since K is arbitrary and ^^i^ kuk < oo, it follows that dTY{Cm ^n) — >■ 0. □ 

20. Further results and problems 

20.1. Level widths. Let, as in Remark 5.6, $l_k(T)$ denote the number of nodes with distance $k$ to the root in a rooted tree $T$.

If $\nu \ge 1$, then $\widehat{\mathcal T}$ is a locally finite tree so all level widths $l_k(\widehat{\mathcal T})$ are finite. It follows easily from the characterisation of convergence in Lemma 6.2 that, in this case, the functional $l_k$ is continuous at $\widehat{\mathcal T}$, and thus Theorem 7.1 implies (see Billingsley [isl, Corollary 1, p. 31])

\[ l_k(\mathcal T_n) \overset{\mathrm d}{\longrightarrow} l_k(\widehat{\mathcal T}) < \infty \tag{20.1} \]

for each $k \ge 0$.

On the other hand, if $\nu < 1$, then $\widehat{\mathcal T}$ has a node with infinite outdegree; this node has a random distance $L-1$ to the root, where $L$ as in Section 5 is the length of the spine, and thus $l_L(\widehat{\mathcal T}) = \infty$.

In the case < < 1, we have ttq < 1 and P(^ ^ 1) = 1 — ttq > 0, 
so for any j, there is a positive probability that the Galton- Watson tree 
T has height at least j, and it follows that of the infinitely many copies 
of T that start in generation L, a.s. infinitely many will survive at least 
until generation L + j. Consequently, a.s., lk{T) = oo for all k ^ L, while 
hiT) < oo for k < L. It follows easily from Lemma [6.31 that in this case too, 
for each /c ^ 0, the mapping Ik : 1^ ^ Nq is continuous at T- Consequently, 

UTn) ^kif) ^oo, A; = 0,1,..., (20.2) 

with ¥{lk{T) < oo) = P(L > k) = . (Recall that /u = i/ in this case by 
«.) 

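When $\nu = 0$, however, (20.2) does not always hold. By Example 5.1,
$\hat{\mathcal T}$ is an infinite star, with $l_1(\hat{\mathcal T}) = \infty$ and $l_k(\hat{\mathcal T}) = 0$ for all $k \ge 2$. By
Theorem 7.10, $l_1(\mathcal T_n) = d^+_{\mathcal T_n}(o) \xrightarrow{p} \infty = l_1(\hat{\mathcal T})$, so (20.2) holds for $k = 1$ (and
trivially for $k = 0$) in the case $\nu = 0$ too (with $l_1(\hat{\mathcal T}) = \infty$). However, by
Example 9.8, if $w_k = k!$, then $l_2(\mathcal T_n) \xrightarrow{d} \mathrm{Po}(1)$, so $l_2(\mathcal T_n)$ does not converge
to $l_2(\hat{\mathcal T}) = 0$. Similarly, by Example 9.9, if $j \ge 2$ and $w_k = k!^\alpha$ with
$0 < \alpha < 1/(j - 1)$, then the number of paths of length $j$ attached to the
root in $\mathcal T_n$ tends to $\infty$ (in probability), so $l_j(\mathcal T_n) \xrightarrow{p} \infty$, while $l_j(\hat{\mathcal T}) = 0$.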

Turning to moments, we have for the expectation, by (5.8), $E\, l_k(\hat{\mathcal T}) = \infty$
if $0 < \nu < 1$ or $\sigma^2 = \infty$; in this case (20.1)-(20.2) and Fatou's lemma yield
$E\, l_k(\mathcal T_n) \to E\, l_k(\hat{\mathcal T}) = \infty$.

If $\nu \ge 1$ and $\sigma^2 < \infty$, then (5.8) yields $E\, l_k(\hat{\mathcal T}) = 1 + k\sigma^2 < \infty$. In this
case, for each fixed $k$, the random variables $l_k(\mathcal T_n)$, $n \ge 1$, are uniformly
integrable, and thus (20.1) implies $E\, l_k(\mathcal T_n) \to E\, l_k(\hat{\mathcal T})$, see Janson, Section
10. (In the case $\nu > 1$, this was shown already by Meir and Moon [85].)
Consequently, for any $\mathbf w$ with $\nu > 0$ and any fixed $k$,

$$ E\, l_k(\mathcal T_n) \to E\, l_k(\hat{\mathcal T}) \le \infty. \tag{20.3} $$

(When $\nu = 0$, this is not always true, by the examples above.)

For higher moments, there remains a small gap. Let $r \ge 1$. When
$0 < \nu < 1$, (20.3) trivially implies $E\, l_k(\mathcal T_n)^r \to E\, l_k(\hat{\mathcal T})^r = \infty$, so suppose
$\nu \ge 1$. Then $E\, \hat\xi^r = E\, \xi^{r+1}$, so if $E\, \xi^{r+1} = \infty$, then
$E\, l_1(\hat{\mathcal T})^r = \infty$; moreover, each $l_k(\hat{\mathcal T})$, $k \ge 1$, stochastically dominates $\hat\xi$ (consider
the offspring of the $k$:th node on the spine), and thus $E\, l_k(\hat{\mathcal T})^r = \infty$
for every $k \ge 1$. Consequently, again immediately by Fatou's lemma and
(20.2), $E\, l_k(\mathcal T_n)^r \to E\, l_k(\hat{\mathcal T})^r = \infty$. The only interesting case is thus when
$E\, \xi^{r+1} < \infty$. If $r \ge 1$ is an integer, it was shown in [59, Theorem 1.13]
that $E\, \xi^{r+1} < \infty$ implies that $E\, l_k(\mathcal T_n)^r$, $n \ge 1$, are uniformly bounded for
each $k \ge 1$. We conjecture that, moreover, $l_k(\mathcal T_n)^r$, $n \ge 1$, are uniformly
integrable, which by (20.1) would yield the following:

Conjecture 20.1. For every integer $r \ge 1$ and every $k \ge 1$, if $\nu > 0$, then

$$ E\, l_k(\mathcal T_n)^r \to E\, l_k(\hat{\mathcal T})^r \le \infty. \tag{20.4} $$

We further conjecture that this holds also for non-integer $r > 0$.

One thus has to consider the case $E\, \xi^{r+1} < \infty$ only, and the result from
[59] implies that (20.4) holds if $E\, \xi^{\lfloor r \rfloor + 2} < \infty$, since then $E\, l_k(\mathcal T_n)^{\lfloor r \rfloor + 1}$ are
uniformly bounded.

20.2. Asymptotic normality. In Theorem 7.11 we proved that $N_d$, the
number of nodes of outdegree $d$ in the random tree $\mathcal T_n$, satisfies $N_d/n \xrightarrow{p} \pi_d$.
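The law of large numbers for the degree counts can be explored numerically. The following is a minimal sketch, not taken from the references: it assumes the standard cyclic-lemma construction, in which i.i.d. outdegrees conditioned on having sum $n - 1$ are rotated into the preorder degree sequence of a conditioned Galton-Watson tree of size $n$. The offspring law `pi`, the size `n = 200`, the seed and all function names are hypothetical choices made for illustration.

```python
import random
from collections import Counter

def sample_conditioned_degrees(pi, n, rng):
    """Sample outdegrees i.i.d. from pi, conditioned on sum = n - 1 (by rejection),
    then rotate them into the preorder degree sequence of a rooted tree."""
    values = list(range(len(pi)))
    while True:
        degs = rng.choices(values, weights=pi, k=n)
        if sum(degs) == n - 1:
            break
    # Cyclic lemma: start right after the first time the walk sum(d_i - 1)
    # attains its minimum; the rotated walk then first hits -1 at time n.
    s, min_s, argmin = 0, 0, 0
    for j, d in enumerate(degs, start=1):
        s += d - 1
        if s < min_s:
            min_s, argmin = s, j
    shift = argmin % n
    return degs[shift:] + degs[:shift]

def degree_fractions(degs):
    """Return the empirical fractions N_d / n for each outdegree d that occurs."""
    n = len(degs)
    counts = Counter(degs)
    return {d: counts[d] / n for d in sorted(counts)}

rng = random.Random(0)          # fixed seed, only for reproducibility
pi = [0.25, 0.5, 0.25]          # hypothetical critical offspring law (mean 1)
degs = sample_conditioned_degrees(pi, 200, rng)
fractions = degree_fractions(degs)
print(fractions)                # each N_d / n should be near the matching pi_d
```

Plain rejection is used only because it is short; the acceptance probability is of order $n^{-1/2}$ for a critical law with finite variance, so this is adequate for moderate $n$ but not efficient.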



In our case I$\alpha$ ($\nu > 1$ or $\nu = 1$ and $\sigma^2 < \infty$), Kolchin [76, Theorem 2.3.1]
gives the much stronger result that the random variable $N_d$ is asymptotically
normal, for every $d \ge 0$:

$$ \frac{N_d - n\pi_d}{\sqrt n} \xrightarrow{d} N(0, \sigma_d^2), \tag{20.5} $$

with

$$ \sigma_d^2 = \pi_d \Bigl( 1 - \pi_d - \frac{(d-1)^2 \pi_d}{\sigma^2} \Bigr). \tag{20.6} $$

(In fact, Kolchin [76] gives a local limit theorem which is a stronger version
of (20.5).)



Under the assumption $E\, \xi^3 < \infty$, Janson [55, Example 3.4] gave another
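proof of (20.5), and showed further joint convergence for different $d$, with
asymptotic covariances, using $I_k := \mathbf 1\{\xi = k\}$,

$$ \sigma^2_{kl} = \operatorname{Cov}(I_k, I_l) - \frac{\operatorname{Cov}(I_k, \xi) \operatorname{Cov}(I_l, \xi)}{\operatorname{Var} \xi} = \pi_k \delta_{kl} - \pi_k \pi_l - \frac{(k-1)(l-1) \pi_k \pi_l}{\sigma^2}. \tag{20.7} $$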
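As a sanity check on the variance and covariance formulas, the following sketch evaluates (20.6) and both expressions in (20.7) in exact rational arithmetic. The two offspring laws are hypothetical examples with mean 1, and the function names are ours. For the law putting mass 1/2 on each of 0 and 2, every tree of size $n$ is a full binary tree with exactly $(n+1)/2$ leaves, so $N_0$ is deterministic and the asymptotic variance should vanish.

```python
from fractions import Fraction as F

def moments(pi):
    """Mean and variance of the offspring law pi = (pi_0, pi_1, ...)."""
    mu = sum(k * p for k, p in enumerate(pi))
    var = sum(k * k * p for k, p in enumerate(pi)) - mu * mu
    return mu, var

def sigma_d(pi, d):
    """Asymptotic variance from (20.6); assumes mean 1 and 0 < variance < infinity."""
    _, var = moments(pi)
    return pi[d] * (1 - pi[d] - (d - 1) ** 2 * pi[d] / var)

def sigma_kl(pi, k, l):
    """Right-hand side of (20.7); assumes mean 1."""
    _, var = moments(pi)
    delta = 1 if k == l else 0
    return pi[k] * delta - pi[k] * pi[l] - (k - 1) * (l - 1) * pi[k] * pi[l] / var

def sigma_kl_from_cov(pi, k, l):
    """Middle expression of (20.7), from covariances of I_k = 1{xi = k}."""
    mu, var = moments(pi)
    cov_kl = (pi[k] if k == l else 0) - pi[k] * pi[l]
    cov_k_xi = k * pi[k] - pi[k] * mu      # Cov(I_k, xi) = E[xi I_k] - pi_k * mu
    cov_l_xi = l * pi[l] - pi[l] * mu
    return cov_kl - cov_k_xi * cov_l_xi / var

pi = [F(1, 4), F(1, 2), F(1, 4)]           # hypothetical: mean 1, variance 1/2
binary = [F(1, 2), F(0), F(1, 2)]          # hypothetical: 0 or 2 children
print(sigma_d(pi, 0), sigma_kl(pi, 0, 2), sigma_d(binary, 0))
```

The diagonal of (20.7) reproduces (20.6), which is one way to confirm that the two displays are consistent with each other.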
Moreover, Janson [55] showed that if $E\, |\xi|^r < \infty$ for every $r$ (which in
particular holds when $\nu > 1$ since then $\tau < \rho$ and $\xi$ has some exponential
moment), then convergence of all moments and joint moments holds in
(20.5); in particular

$$ E\, N_k = n\pi_k + o(n) \quad\text{and}\quad \operatorname{Cov}(N_k, N_l) = n\sigma^2_{kl} + o(n). \tag{20.8} $$

In the case $\nu > 1$, Minami and Drmota [33, Section 3.2.1] have
given other proofs of the (joint) asymptotic normality using the saddle point
method; Drmota [33] shows further the stronger moment estimates

$$ E\, N_d = n\pi_d + O(1) \quad\text{and}\quad \operatorname{Var} N_d = n\sigma_d^2 + O(1). \tag{20.9} $$

Problem 20.2. Do these results hold in the case $\nu = 1$, $\sigma^2 < \infty$ without
extra moment conditions? Do they extend to the case $\nu = 1$, $\sigma^2 = \infty$? What
happens when $0 \le \nu < 1$?






Problem 20.3. Extend this to the more general case of balls-in-boxes as in
Theorem 10.4. (We guess that the case $0 < \lambda < \nu$ is easy by the methods in
the references above, in particular [55] and [33, Section 3.2.1], but we have
not checked the details.)



Problem 20.4. Extend this to the subtree counts in Theorem 7.12.



20.3. Height and width. We have studied the random trees $\mathcal T_n$ without
any scaling. Since our mode of convergence really means that we consider
only a finite number of generations at a time, we are really looking at the
base of the tree, with the first generations. The results in this paper thus
do not say anything about, for example, the height and width of $\mathcal T_n$. (Recall
that if $T$ is a rooted tree, then the height $H(T) := \max\{k : l_k(T) > 0\}$, the
maximum distance from the root, and the width $W(T) := \max_k \{l_k(T)\}$, the
largest size of a generation.) However, there are other known results.
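For concreteness, the level widths, height and width of a finite rooted tree can be computed as in the following sketch. The encoding is a hypothetical choice of ours: the tree is given as a list `parent`, with node 0 as the root, `parent[0] = -1`, and every other node labelled after its parent (so `parent[v] < v`).

```python
def level_widths(parent):
    """Return [l_0(T), l_1(T), ..., l_H(T)] for a tree given by parent pointers.

    Assumes node 0 is the root (parent[0] == -1) and parent[v] < v for v >= 1,
    so that depths can be filled in with a single left-to-right pass.
    """
    depth = [0] * len(parent)
    for v in range(1, len(parent)):
        depth[v] = depth[parent[v]] + 1
    widths = [0] * (max(depth) + 1)
    for d in depth:
        widths[d] += 1
    return widths

def height(parent):
    """H(T) = max{k : l_k(T) > 0}, the maximum distance from the root."""
    return len(level_widths(parent)) - 1

def width(parent):
    """W(T) = max_k l_k(T), the largest size of a generation."""
    return max(level_widths(parent))

# A small tree with 7 nodes: root 0 has children 1, 2; node 1 has children
# 3, 4, 5; node 3 has the single child 6.
example = [-1, 0, 0, 1, 1, 1, 3]
print(level_widths(example), height(example), width(example))
```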

In the case $\nu \ge 1$, $\sigma^2 < \infty$ (the case I$\alpha$ in Section 8), it is well-known that
both the height $H(\mathcal T_n)$ and the width $W(\mathcal T_n)$ of $\mathcal T_n$ typically are of order $\sqrt n$;
more precisely,

$$ H(\mathcal T_n)/\sqrt n \xrightarrow{d} 2\sigma^{-1} X, \tag{20.10} $$

$$ W(\mathcal T_n)/\sqrt n \xrightarrow{d} \sigma X, \tag{20.11} $$

where $X$ is some strictly positive random variable (in fact, $X$ equals the
maximum of a standard Brownian excursion and has what is known as a
theta distribution), see e.g. Kolchin [76], Aldous, Chassaing, Marckert
and Yor [25], Janson [59] and Drmota [33]. There are also results for a
single level giving an asymptotic distribution for $l_{k(n)}(\mathcal T_n)/\sqrt n$ when the
level $k(n) \sim a\sqrt n$ for some $a > 0$, see Kolchin [76, Theorem 2.4.5].

Since the variance $\sigma^2$ appears as a parameter in these results, we cannot
expect any simple extensions to the case $\sigma^2 = \infty$, and even less to the case
$0 \le \nu < 1$. Nevertheless, we conjecture that (20.10) and (20.11) extend
formally at least to the case $\nu = 1$ and $\sigma^2 = \infty$:
Conjecture 20.5. If $\nu = 1$ and $\sigma^2 = \infty$, then $H(\mathcal T_n)/\sqrt n \xrightarrow{p} 0$.

Conjecture 20.6. If $\nu = 1$ and $\sigma^2 = \infty$, then $W(\mathcal T_n)/\sqrt n \xrightarrow{p} \infty$.

Problem 20.7. Does $\nu < 1$ imply that $H(\mathcal T_n)/\sqrt n \xrightarrow{p} 0$?

Problem 20.8. Does $\nu < 1$ imply that $W(\mathcal T_n)/\sqrt n \xrightarrow{p} \infty$?

Furthermore, still in the case $\nu \ge 1$, $\sigma^2 < \infty$, Addario-Berry, Devroye
and Janson [1] have shown sub-Gaussian tail estimates for the height and
width

$$ P(H(\mathcal T_n) \ge x\sqrt n) \le C e^{-cx^2}, \tag{20.12} $$

$$ P(W(\mathcal T_n) \ge x\sqrt n) \le C e^{-cx^2}, \tag{20.13} $$

uniformly in all $x \ge 0$ and $n \ge 1$ (with some positive constants $C$ and
$c$ depending on $\pi$ and thus on $\mathbf w$). In view of (20.11), we cannot expect
(20.13) to hold when $\sigma^2 = \infty$ (or when $\nu < 1$), but we see no reason why
(20.12) cannot hold; (20.10) suggests that $H(\mathcal T_n)$ typically is smaller when
$\sigma^2 = \infty$.

Problem 20.9. Does (20.12) hold for any weight sequence $\mathbf w$ (with $C$ and
$c$ depending on $\mathbf w$, but not on $x$ or $n$)?

It follows from (20.10)-(20.11) and (20.12)-(20.13) that $E\, H(\mathcal T_n)/\sqrt n$ and
$E\, W(\mathcal T_n)/\sqrt n$ converge to positive numbers. (In fact, the limits are $\sqrt{2\pi}/\sigma$
and $\sqrt{\pi/2}\,\sigma$, see e.g. Janson [61], where also joint moments are computed.)

Problem 20.10. What are the growth rates of $E\, H(\mathcal T_n)$ and $E\, W(\mathcal T_n)$ when
$\sigma^2 = \infty$ or $\nu < 1$?

20.4. Scaled trees. The results (20.10)-(20.11), as well as many other results
on various asymptotics of $\mathcal T_n$ in the case $\nu \ge 1$, $\sigma^2 < \infty$, can be seen
as consequences of the convergence of the tree $\mathcal T_n$, after rescaling in a suitable
sense in both height and width by $\sqrt n$, to the continuum random tree
defined by Aldous, see also Le Gall [80]. (The continuum random
tree is not an ordinary tree; it is a compact metric space.) This has been
extended to the case $\sigma^2 = \infty$ when $\pi$ is in the domain of attraction of a
stable distribution, see e.g. Duquesne [34] and Le Gall [80, 81]; the limit is
now a different random metric space called a stable tree.

Problem 20.11. Is there some kind of similar limiting object in the case
$\nu < 1$ (after suitable scaling)?

20.5. Random walks. Simple random walk on the infinite random tree $\hat{\mathcal T}$
has been studied by many authors in the critical case $\nu \ge 1$, in particular
when $\sigma^2 < \infty$, see e.g. Kesten, Barlow and Kumagai, Durhuus,
Jonsson and Wheater [35], Fujii and Kumagai, but also when $\sigma^2 = \infty$,
see Croydon and Kumagai [30] (assuming attraction to a stable law).
A different approach is to study simple random walk on $\mathcal T_n$ and study
asymptotics as $n \to \infty$. For example, by rescaling the tree one can obtain
convergence to a process on the continuum random tree (when $\sigma^2 < \infty$) or
stable tree (assuming attraction to a stable law), see Croydon.

For $\nu < 1$, the simple random walk on $\hat{\mathcal T}$ does not make sense, since the
tree has a node with infinite degree. Nevertheless, it might be interesting
to study simple random walk on $\mathcal T_n$ and find asymptotics of interesting
quantities as $n \to \infty$.

20.6. Multi-type conditioned Galton-Watson trees. It seems likely
that there are results similar to the ones in Section 7 for multi-type Galton-Watson
trees conditioned on the total size, or perhaps on the number of
nodes of each type, and for corresponding generalizations of simply generated
random trees. We are not aware of any such results, however, and leave
this as an open problem. (See Kurtz, Lyons, Pemantle and Peres for
related results that presumably are useful.)



21. Different conditionings for Galton-Watson trees

One of the principal objects studied in this paper is the conditioned
Galton-Watson tree $(\mathcal T \mid |\mathcal T| = n)$, i.e. a Galton-Watson tree $\mathcal T$ conditioned
on its total size being $n$; we then let $n \to \infty$. This is one way to
consider very large Galton-Watson trees, but there are also other similar
conditionings. For comparison, we briefly consider two possibilities; see further
Kennedy and Aldous and Pitman [6]. We denote the offspring
distribution by $\xi$ and its probability generating function by $\Phi(t)$.

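21.1. Conditioning on $|\mathcal T| \ge n$. If $E\, \xi \le 1$, i.e., in the subcritical and
critical cases, $|\mathcal T| < \infty$ a.s. and thus $\mathcal T$ conditioned on $|\mathcal T| \ge n$ is a mixture
of $(\mathcal T \mid |\mathcal T| = N) = \mathcal T_N$ for $N \ge n$. It follows immediately from Theorem 7.1
that $(\mathcal T \mid |\mathcal T| \ge n) \xrightarrow{d} \hat{\mathcal T}$ as $n \to \infty$.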

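If $E\, \xi > 1$, i.e., in the supercritical case, on the other hand, the event
$|\mathcal T| = \infty$ has positive probability, and the events $|\mathcal T| \ge n$ decrease to
$|\mathcal T| = \infty$. Consequently,

$$ (\mathcal T \mid |\mathcal T| \ge n) \xrightarrow{d} (\mathcal T \mid |\mathcal T| = \infty), \tag{21.1} $$

a supercritical Galton-Watson tree conditioned on non-extinction.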

Remark 21.1. When $\mathcal T$ is supercritical, the conditioned Galton-Watson
tree $(\mathcal T \mid |\mathcal T| = \infty)$ in (21.1) can be constructed by a 2-type Galton-Watson
process, somewhat similar to the construction of $\hat{\mathcal T}$ in Section 5. Let $q :=
P(|\mathcal T| < \infty) < 1$ be the extinction probability, which is given by $\Phi(q) = q$.
Consider a Galton-Watson process $\tilde{\mathcal T}$ with individuals of two types, mortal
and immortal, where a mortal gets only mortal children while an immortal
may get both mortal and immortal children. The numbers $\xi'$ of mortal and
$\xi''$ of immortal children are described by the probability generating functions

$$ E\, x^{\xi'} y^{\xi''} = \Phi_m(x) := \Phi(qx)/q \tag{21.2} $$

for a mortal and

$$ E\, x^{\xi'} y^{\xi''} = \Phi_i(x, y) := \frac{\Phi(qx + (1-q)y) - \Phi(qx)}{1 - q} \tag{21.3} $$

for an immortal (with the children coming in random order). Note that the
subtree started by a mortal is subcritical (since $\Phi_m'(1) = \Phi'(q) < 1$, cf. (4.9)),
and thus a.s. finite, while every immortal has at least one immortal child
(since $\Phi_i(x, 0) = 0$) and thus the subtree started by an immortal is infinite.
It is easily verified that $\mathcal T$ conditioned on non-extinction equals this random
tree $\tilde{\mathcal T}$ started with an immortal, while $\mathcal T$ conditioned on extinction equals
$\tilde{\mathcal T}$ started with a mortal. (See Athreya and Ney, Section I.12, where this
is stated in a somewhat different form.)

One important difference from $\hat{\mathcal T}$ is that the two-type tree $\tilde{\mathcal T}$ does not have a single spine;
started with an immortal it has a.s. an uncountable number of infinite paths
from the root.

Note that $\hat{\mathcal T}$ in the critical case can be seen as a limit case of this construction.
If we let $q \nearrow 1$, which requires that we really consider a sequence
of different distributions with generating functions $\Phi^{(n)}(t) \to \Phi(t)$, then taking
the limits in (21.2)-(21.3) gives for the limiting critical distribution the
offspring generating functions $\Phi_m(x) = \Phi(x)$ and $\Phi_i(x, y) = y\Phi'(x)$, which
indeed are the generating functions for the offspring distributions in Section 5
in the critical case (with mortal = normal and immortal = special),
since $E\, x^{\hat\xi - 1} y = y\Phi'(x) = \Phi_i(x, y)$.
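The two-type decomposition of a supercritical tree into mortal and immortal individuals can be checked numerically. The sketch below is ours and rests on stated assumptions: the offspring law `pi` is a hypothetical supercritical example with $\pi_0 > 0$ (so that $q > 0$), and $q$ is approximated by iterating the probability generating function from 0. It builds the mortal generating function $\Phi(qx)/q$ and the immortal generating function $(\Phi(qx + (1-q)y) - \Phi(qx))/(1-q)$ and evaluates the properties used in the remark.

```python
def make_pgf(pi):
    """Probability generating function t -> sum_k pi_k t^k."""
    return lambda t: sum(p * t ** k for k, p in enumerate(pi))

def extinction_probability(pi, iterations=200):
    """Smallest root of Phi(q) = q in [0, 1], as the limit of the iterates of Phi at 0."""
    phi = make_pgf(pi)
    q = 0.0
    for _ in range(iterations):
        q = phi(q)
    return q

def two_type_pgfs(pi):
    """Return (q, phi_m, phi_i) for a supercritical law with pi_0 > 0."""
    phi = make_pgf(pi)
    q = extinction_probability(pi)
    phi_m = lambda x: phi(q * x) / q
    phi_i = lambda x, y: (phi(q * x + (1 - q) * y) - phi(q * x)) / (1 - q)
    return q, phi_m, phi_i

pi = [0.25, 0.0, 0.75]                     # hypothetical law with mean 1.5
q, phi_m, phi_i = two_type_pgfs(pi)
# Mean number of children of a mortal: Phi_m'(1) = Phi'(q).
mean_mortal = sum(k * p * q ** (k - 1) for k, p in enumerate(pi) if k >= 1)
print(q, mean_mortal, phi_m(1.0), phi_i(1.0, 1.0), phi_i(0.7, 0.0))
```

For this example the fixed-point equation is $3q^2 - 4q + 1 = 0$, so $q = 1/3$ and the mortal subtrees have mean offspring $\Phi'(q) = 1/2 < 1$, in line with the subcriticality claim.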



21.2. Conditioning on $H(\mathcal T) \ge n$. To condition on the height $H(\mathcal T)$ being
at least $n$ is the same as conditioning on $l_n(\mathcal T) > 0$, i.e., that the Galton-Watson
process survives for at least $n$ generations.

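If $E\, \xi > 1$, i.e., in the supercritical case, the events $l_n(\mathcal T) > 0$ decrease to
$|\mathcal T| = \infty$. Consequently,

$$ (\mathcal T \mid H(\mathcal T) \ge n) = (\mathcal T \mid l_n(\mathcal T) > 0) \xrightarrow{d} (\mathcal T \mid |\mathcal T| = \infty), \tag{21.4} $$

exactly as when conditioning on $|\mathcal T| \ge n$ in (21.1). By Remark 21.1, the
limit equals the two-type tree $\tilde{\mathcal T}$ of that remark, started with an immortal.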

In the subcritical and critical cases, the following result, proved by Kesten
(at least for $E\, \xi = 1$, see also Aldous and Pitman [6]), shows convergence
to the size-biased Galton-Watson tree $\mathcal T^*$ in Remark 5.7.



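Theorem 21.2. Suppose that $\mu := E\, \xi \le 1$. Then, as $n \to \infty$,

$$ (\mathcal T \mid H(\mathcal T) \ge n) = (\mathcal T \mid l_n(\mathcal T) > 0) \xrightarrow{d} \mathcal T^*. \tag{21.5} $$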

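Proof. Let $r_n := P(l_n(\mathcal T) > 0)$, the probability of survival for at least $n$
generations. Then $r_n \to 0$ as $n \to \infty$. Fix $\ell > 0$ and a tree $T$ with height
$\ell$. Conditioned on $\mathcal T^{(\ell)} = T$, the remainder of the tree consists of $l_\ell(T)$
independent branches, each distributed as $\mathcal T$, and thus, for $n > \ell$,

$$ P(\mathcal T^{(\ell)} = T \mid H(\mathcal T) \ge n) = \frac{P(\mathcal T^{(\ell)} = T \text{ and } H(\mathcal T) \ge n)}{P(H(\mathcal T) \ge n)} = \frac{P(\mathcal T^{(\ell)} = T) \bigl(1 - (1 - r_{n-\ell})^{l_\ell(T)}\bigr)}{P(H(\mathcal T) \ge n)}. \tag{21.6} $$

Let $\mathfrak T^{(\ell)}$ be the set of finite trees of height $\ell$. Summing (21.6) over $T \in \mathfrak T^{(\ell)}$
yields 1, and thus

$$ P(H(\mathcal T) \ge n) = \sum_{T \in \mathfrak T^{(\ell)}} P(\mathcal T^{(\ell)} = T) \bigl(1 - (1 - r_{n-\ell})^{l_\ell(T)}\bigr). \tag{21.7} $$

Dividing by $r_{n-\ell}$, and noting that for any $N \ge 1$, $(1 - (1 - r)^N)/r \nearrow N$ as
$r \searrow 0$, we find by monotone convergence

$$ \frac{P(H(\mathcal T) \ge n)}{r_{n-\ell}} \to \sum_{T \in \mathfrak T^{(\ell)}} P(\mathcal T^{(\ell)} = T)\, l_\ell(T) = E\, l_\ell(\mathcal T) = \mu^\ell. \tag{21.8} $$

Hence, by (21.6) and (21.8),

$$ P(\mathcal T^{(\ell)} = T \mid H(\mathcal T) \ge n) \to \frac{l_\ell(T)}{\mu^\ell} P(\mathcal T^{(\ell)} = T) = P(\mathcal T^{*(\ell)} = T). \tag{21.9} $$

Thus, $(\mathcal T \mid H(\mathcal T) \ge n)^{(\ell)} \xrightarrow{d} \mathcal T^{*(\ell)}$ for every $\ell$, and the result follows. □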
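Since $P(H(\mathcal T) \ge n) = r_n$, the limit (21.8) says that $r_n / r_{n-\ell} \to \mu^\ell$, which is easy to observe numerically. The sketch below is ours; it assumes only the standard recursion $r_n = 1 - \Phi(1 - r_{n-1})$ with $r_0 = 1$, written as $\sum_k \pi_k (1 - (1 - r_{n-1})^k)$, and the two offspring laws are hypothetical examples (one subcritical with $\mu = 3/4$, one critical).

```python
def survival_probabilities(pi, n_max):
    """Return [r_0, ..., r_{n_max}], where r_n = P(l_n(T) > 0).

    Uses r_0 = 1 and r_n = sum_k pi_k * (1 - (1 - r_{n-1})**k),
    i.e. r_n = 1 - Phi(1 - r_{n-1}) for the offspring generating function Phi.
    """
    r = [1.0]
    for _ in range(n_max):
        prev = r[-1]
        r.append(sum(p * (1 - (1 - prev) ** k) for k, p in enumerate(pi)))
    return r

subcritical = [0.5, 0.25, 0.25]            # hypothetical law with mu = 0.75
critical = [0.25, 0.5, 0.25]               # hypothetical law with mu = 1
ell = 2

r_sub = survival_probabilities(subcritical, 40)
ratio_sub = r_sub[40] / r_sub[40 - ell]    # should be close to 0.75 ** 2

r_crit = survival_probabilities(critical, 2000)
ratio_crit = r_crit[2000] / r_crit[2000 - ell]   # should be close to 1

print(ratio_sub, ratio_crit)
```

In the subcritical example the ratio converges geometrically fast, while in the critical example $r_n$ decays only polynomially and the ratio approaches 1 slowly.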

Note that if $E\, \xi = 1$, then $\mathcal T^* = \hat{\mathcal T}$, see Remark 5.7, so by Theorems
7.1 and 21.2, $\mathcal T$ conditioned on $|\mathcal T| = n$ and $\mathcal T$ conditioned on $H(\mathcal T) \ge n$ have the
same limit. However, in the subcritical case $E\, \xi < 1$, $\mathcal T^* \ne \hat{\mathcal T}$; moreover, $\mathcal T^*$
differs also from the limit in Theorem 7.1, which is $\hat{\mathcal T}$ for a conjugated distribution,
and the same is true in the supercritical case. Hence, as remarked
by Kennedy, conditioning on $|\mathcal T| = n$ and $H(\mathcal T) \ge n$ give similar results
(in the sense that the limits as $n \to \infty$ are the same) in the critical case, but
quite different results in the subcritical and supercritical cases. Similarly,
conditioning on $|\mathcal T| \ge n$ and $H(\mathcal T) \ge n$ give quite different results in the
subcritical case. Aldous and Pitman [6] remark that the two different limits
as $n \to \infty$ both can be intuitively interpreted as "$\mathcal T$ conditioned on being
infinite", which shows that one has to be careful with such interpretations.